CHANGE-POINT PROBLEMS IN REGRESSION


Hyune-Ju Kim, Ph.D.
Stanford  University,  1988 


This dissertation focuses on the problem of testing for a change in the regression
model when errors are independently, normally distributed with constant, known or
unknown variance. First we consider the regression model in which only the intercept changes
at some unknown point (Model-1). Secondly, the model in which both intercept and slope
change is considered (Model-2). In all cases, the likelihood ratio statistic (LRS) is of the
form $U = \max_{1 \le i \le m} U_i$, where the distributions of the $U_i$'s vary according to the assumptions.

In both models, we consider the likelihood ratio test (LRT) as a boundary-crossing
problem for a discrete stochastic process and study problems such as approximations
to significance levels, powers, and confidence regions for a change point. First of
all, we propose a modified LRT and discuss asymptotic properties of test statistics in cases
of random and fixed independent variables. In both cases we derive analytical
approximations to significance levels. When the independent variables are random, the limiting
distribution of the modified LRS is a function of a Brownian motion and approximations
in Siegmund (1986, Annals of Statistics) are used. For fixed independent variables, the
limiting distribution involves a Gaussian process with nondifferentiable sample paths. In
this case, an approximation is derived assuming known variance and mild conditions
on the empirical distribution of the independent variable, using the argument in
Leadbetter, Lindgren and Rootzen (1983, Chapter 12), modified for discrete time by Hogan and
Siegmund (1986, Advances in Applied Mathematics). In Model-1, we are also concerned
with the power of the LRT and confidence regions for a change point.

Numerical  approximations  of  significance  levels  and  powers  of  the  LRT  and  the  results 
of  corresponding  Monte  Carlo  experiments  are  obtained.  We  find  that  the  simulations 
confirm  that  the  theoretical  results  perform  well  and  demonstrate  that  the  results  also 
can  be  applied  to  the  unknown  variance  case. 
Chapter 1


Introduction 


1.1.  Change-Point  Problems 

In  recent  years  increasing  interest  has  been  shown  in  problems  about  stability  of 
models  for  a  sequence  of  observations.  When  a  series  of  observations  is  taken  sequentially, 
it  can  happen  that  the  whole  set  of  observations  can  be  divided  into  subsets,  each  of  which 
can be regarded as a random sample from a different distribution. When the points at
which the model changes are unknown, two distinct problems arise: detection and
estimation of change points.

Change-point  problems  originally  arose  in  quality  control  to  detect  changes  in  the 
quality  of  output  from  a  continuous  production  process.  A  process  in  control  maintains 
an approximately constant quality of output. Suppose that the process jumps out of
control at some unknown point, so that the quality worsens and the output becomes unacceptable.
In order to take action when such a deterioration is suspected, any departure of the
output from the target value must be signalled as soon as possible.

One  of  the  simplest  examples  is  the  problem  of  detecting  a  single  change  in  the  mean 
of  normal  random  variables  having  known  and  fixed  variance.  Sequential  detection  of  a 
change  in  the  mean  of  the  distribution  of  observations  has  been  studied  by  Page  (1954), 
Shiryayev (1963), Lorden (1971), and Pollak (1985). For fixed sample problems involving
a  finite  sequence  of  observations,  Siegmund  (1986)  gave  an  analytic  approximation  for 
a  significance  level  of  the  likelihood  ratio  test  (LRT)  and  discussed  confidence  sets  for 
a  change  point.  James,  James,  and  Siegmund  (1987)  considered  the  unknown  variance 
case  as  well  as  the  known  variance  case  and  studied  various  tests,  such  as  those  based  on 
the likelihood ratio and recursive residuals. Also, power approximations were developed
by  integrating  approximations  for  conditional  boundary  crossing  probabilities. 

Change-point  problems  arise  in  various  ways  and  have  been  considered  in  regression 
models, time-series models, and survival analysis. For a change in a binomial probability,
Hinkley and Hinkley (1970) used maximum likelihood methods to estimate a change
point for binary random variables and derived exact and asymptotic distributions of the
maximum likelihood estimator of the change point. A cumulative sum test statistic for
this problem was proposed by Pettitt (1980) and a nonparametric cumulative sum statistic
was applied to binomial random variables by Pettitt (1979). An example of this type
of change in epidemiology was described in Worsley (1983), who used the LRT to test
for a change in probability of a sequence of independent binomial random variables. He
also compared powers of the LRT and the cumulative sum test and discussed the
relationship between the cumulative sum test and a two-sample Kolmogorov-Smirnov test.
Worsley (1986) used maximum likelihood methods to test for a change in a sequence
of independent random variables from an exponential family. He found the exact null
and alternative distributions of the test statistics using an iterative numerical procedure.
Exact and approximate confidence regions for the change point were given, based on a
level $\alpha$ LRT and a modification of the method proposed by Cox and Spjøtvoll (1982). He
also discussed an application to the data set on the time intervals between explosions in
British coal mines between 1875 and 1950.

Change-point  problems  in  time-series  models  have  been  considered  in  Picard  (1985) 
who  discussed  applications  to  Canadian  lynx  data,  IBM  common  stock  closing  prices, 
and  German  unemployment  data.  Picard  was  concerned  with  detecting  two  kinds  of 
changes:  first  is  a  change  in  the  spectrum  of  a  time  series;  secondly  she  considered  a 
change  in  the  mean  or  covariance  of  an  autoregressive  process.  Matthews,  Farewell,  and 
Pyke (1985) gave an example of change-point problems in survival analysis. They
considered the problem of testing for a constant failure rate against alternatives with failure
rates involving a single change-point. Examples of change-point problems in regression
models were discussed in a number of papers. With three econometric examples, Brown,
Durbin, and Evans (1975) discussed two-phase multiple regression problems. In Esterby
and El-Shaarawi (1981), a two-phase polynomial regression model was proposed for
the pollen concentration in lake sediment cores. Also, Beals (1972, Chapter 12) shows a
data set to which a multi-phase regression model can be applied. In this dissertation, we
study change-point problems in regression models, especially two-phase linear regression
problems.

1.2.  Two-Phase  Regression 

Regression  models  which  are  composed  of  two  different  linear  phases  have  many 
applications.  As  in  Brown,  Durbin  and  Evans  (1975),  it  might  be  suspected  that  the  slope 
and  the  intercept  have  changed  after  an  unknown  point  in  the  sequence  of  observations. 
In  some  cases,  it  may  be  necessary  to  consider  a  regression  model  in  which  only  one  of 
the  parameters  changes,  while  the  other  remains  constant.  Maronna  and  Yohai  (1978) 
considered  a  two-phase  regression  model  in  which  only  the  intercept  term  changes  and 
discussed  applications  in  meteorology. 

The  two-phase  regression  model  was  first  studied  by  Quandt  (1958)  who  proposed 
a  maximum  likelihood  method  to  estimate  the  parameters  in  the  broken  line  regression 
model.  Quandt  (1960)  also  proposed  a  likelihood  ratio  test  (LRT)  to  test  for  a  change 
in  the  regression  model  as  opposed  to  the  null  hypothesis  that  the  data  follow  only  one 
simple  linear  regression.  On  the  basis  of  the  empirical  distribution  resulting  from  some 
sampling experiments, he concluded that -2 log(likelihood ratio) could not have a chi-square
distribution  with  the  appropriate  degrees  of  freedom  under  the  null  hypothesis. 

A  second  approach  to  the  problem  of  testing  for  a  change  in  a  regression  model 
is to use recursive residuals introduced by Brown, Durbin, and Evans (1975). Brown,
Durbin, and Evans developed tests based on the cusum and cusum of squares of recursive
residuals, defined to be uncorrelated with zero means and constant variance. They also
considered other techniques based on moving regressions and on the regression models
whose coefficients are polynomial in time. As well, the plotting of Quandt's log likelihood
ratio statistic (LRS) was suggested. They discussed applications of these techniques to
three  sets  of  real  data  taken  from  the  field  of  economics. 

Since the 1960's, there has been considerable attention to the estimation of parameters
as well as the problem of testing for a change in the regression model. Feder (1975)
showed by example that if the true model contains fewer phases than the assumed model,
the least squares estimators are not asymptotically normal and the -2 log(likelihood ratio)
statistic is not asymptotically chi-square. He also concluded that the asymptotic null
distribution of the -2 log(likelihood ratio) would depend on the configuration of the values of
the independent variable. Beckman and Cook (1979) further investigated the dependence
of the test on the values of the independent variable and gave critical values for testing
for a change in the regression model by simulation. They used four different configurations
of  the  values  of  the  independent  variable,  and  their  results  show  that  this  configuration 
can  have  a  significant  influence  on  the  null  distribution  of  the  LRS.  They  also  discussed 
differences  between  the  continuous  model  in  which  the  composite  regression  function  is 
constrained  to  be  continuous  at  the  change  point  and  the  discontinuous  model  in  which  it 
is  not.  Hawkins  (1980)  pointed  out  that  the  inferential  theory  of  the  two-phase  regression 
model  depends  strongly  on  whether  or  not  continuity  at  the  change-point  is  assumed. 

The difficulties of this problem are that standard maximum likelihood asymptotic
theory is not applicable and that the null distribution of the test statistic depends on
the spacings of the values of the independent variable. The sampling distributions of most
of the test statistics described below are quite complicated. Because of this complexity,
most previous work has used numerical or Monte Carlo methods. Worsley (1983) gave
analytic approximations to an upper bound on the null distribution function of the test
statistic based on an improved Bonferroni inequality. He considered a general multiple
regression model with a normal random error of constant variance, where there may be a
change in the coefficient vector at an unknown point in the data. Worsley's upper bounds
are much better than Bonferroni's. However, they require considerable numerical work and
sometimes the errors are quite substantial, especially for larger sample sizes.
This dissertation focuses on the problem of testing for a change in the regression
model when the errors are independently, normally distributed with constant variance.
Two kinds of models are considered. The first is the regression model
in which only the intercept changes at some unknown point (Model-1). The second is the
model in which both the intercept and the slope change (Model-2). Model-2
is considered without a continuity constraint. The null distributions in these
cases are as follows. In Model-1, if the variance is known, then the LRS is the maximum
absolute value of correlated standard normal random variables. If the variance is unknown,
then the LRS is the maximum absolute value of the ratios of correlated standard normal
random variables and the square root of a chi-square random variable. In Model-2, if the
variance of the error variable is known, then the LRS is the maximum of correlated
chi-square random variables with 2 degrees of freedom. If the variance is unknown, then the
LRS is the maximum of correlated Beta random variables. In all cases, the LRS is of the
form

$$U = \max_{1 \le i \le m} U_i,$$

where the distributions of the $U_i$'s vary according to the assumptions. A point of interest is how
to deal with the maximization in the LRS. Since it is difficult to get the exact distribution
of $U$, Beckman and Cook (1979) suggested a simple bound on the distribution function
based on Bonferroni's inequality:

$$\Pr(U > u) = \Pr\Big(\bigcup_i A_i\Big) \le \sum_i \Pr(A_i),$$

where $A_i$ is the event $\{U_i > u\}$. Worsley (1983) improved this upper bound to

$$\Pr(U > u) = \Pr\Big(\bigcup_i A_i\Big) \le \sum_i \Pr(A_i) - \sum_i \Pr(A_i \cap A_{i+1}).$$


In  this  dissertation,  the  LRT  is  considered  as  the  problem  of  the  boundary  crossing  by 
the  discrete  stochastic  process  and  an  approximation  to  the  null  distribution  function  is 
derived  under  mild  conditions. 
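
The two bounds above can be compared on simulated data. The following is a minimal sketch, not part of the dissertation: it assumes the simplest setting (a change in mean only, no covariate, unit error variance), takes $A_i = \{|U_i| > u\}$, and uses hypothetical function names. Both bounds hold sample by sample, because the number of runs of consecutive exceedances is at least one whenever any exceedance occurs, so the estimated quantities are always ordered.

```python
import math
import random


def mean_shift_statistics(y):
    """Standardized statistics U_i, i = 1..m-1, for a change in mean (unit variance)."""
    m = len(y)
    total = sum(y)
    stats = []
    partial = 0.0
    for i in range(1, m):
        partial += y[i - 1]
        # (mean of all m) - (mean of first i), scaled to unit variance under H0
        diff = total / m - partial / i
        stats.append(math.sqrt(m * i / (m - i)) * diff)
    return stats


def bound_estimates(m, u, n_rep, seed=0):
    """Monte Carlo estimates of Pr(max|U_i| > u), the Bonferroni bound,
    and the improved bound that subtracts Pr(A_i and A_{i+1})."""
    rng = random.Random(seed)
    exceed = first_order = second_order = 0
    for _ in range(n_rep):
        y = [rng.gauss(0.0, 1.0) for _ in range(m)]
        a = [abs(s) > u for s in mean_shift_statistics(y)]  # events A_i
        exceed += any(a)
        first_order += sum(a)
        second_order += sum(a[i] and a[i + 1] for i in range(len(a) - 1))
    return (exceed / n_rep,
            first_order / n_rep,
            (first_order - second_order) / n_rep)


if __name__ == "__main__":
    p, bonferroni, improved = bound_estimates(m=20, u=2.5, n_rep=5000, seed=1)
    print(p, improved, bonferroni)
```

Because neighbouring $U_i$ are strongly correlated, the correction term is typically a large fraction of the first-order sum, which is why the improved bound is so much tighter than Bonferroni's.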

Chapter 2 deals with the case that only the intercept can change and is organized
as follows. In Section 2.1, the modified LRT (MLRT) to test for a change only in the
intercept  term  is  proposed.  Section  2.2  discusses  asymptotic  properties  of  test  statistics 
in  the  cases  of  random  and  fixed  independent  variables.  In  both  cases,  Section  2.3  gives 
analytic  approximations  to  significance  levels.  When  the  independent  variable  is  random, 
the  limiting  distribution  of  the  modified  LRS  (MLRS)  involves  a  Brownian  motion  and 
results  in  Siegmund  (1986)  are  used  to  approximate  significance  levels.  For  fixed  values  of 
the  independent  variable,  an  approximation  is  derived  assuming  that  the  variance  of  the 
error  variable  is  known  and  that  the  observations  of  the  independent  variable  satisfy  certain 
conditions.  Since  the  independent  variables  are  not  random  in  most  applications,  this  case 
is  the  most  important  and  the  most  difficult  one.  When  the  independent  variables  are 
nonrandom,  the  limiting  distribution  of  the  MLRS  is  not  a  function  of  a  well-known  process 
like  Brownian  motion.  However  it  involves  a  Gaussian  process  with  nondifferentiable 
sample  paths.  To  approximate  the  boundary  crossing  probability  by  a  discrete  stochastic 
process  whose  limiting  process  has  a  non-differentiable  sample  path,  the  argument  in 
Leadbetter, Lindgren and Rootzen (1983, Chapter 12), modified for discrete time by Hogan
and  Siegmund  (1986),  is  used.  Section  2.4  is  concerned  with  power  of  the  MLRT  and 
confidence  regions  for  a  change  point. 

Chapter  3  obtains  results  like  those  of  Chapter  2  for  the  case  in  which  both  the 
intercept  and  the  slope  change. 

In  Chapters  2  and  3,  numerical  approximations  of  significance  levels  and  powers  of 
the  MLRT  and  the  results  of  corresponding  Monte  Carlo  experiments  are  also  reported. 
The simulations confirm that the theoretical results perform well and demonstrate that
the  results  derived  under  the  assumption  that  variance  is  known  also  can  be  applied  to 
the  unknown  variance  case. 

Finally, the Appendix reviews several basic facts concerning the convergence of
stochastic processes and discusses Siegmund's (1986) results which are used in Chapters
2 and 3.


Chapter  2 




Change  in  Intercept  Alone 

2.1.  Models  and  Test  Statistics 

Let $(x_j, y_j)$, $j = 1, \ldots, m$, be a sequence of $m$ pairs of observations such that
$y_j = \alpha^{(j)} + \beta x_j + \varepsilon_j$, where the $\alpha^{(j)}$'s and $\beta$ are unknown parameters and the $\varepsilon_j$'s are independently
and normally distributed with mean 0 and constant variance $\sigma^2$.

Consider the problem of testing the null hypothesis that the data follow one simple
linear regression against the alternative hypothesis that there is a change only in the
intercept term. Then the hypotheses can be described more formally as

$$H_0: \alpha^{(j)} = \alpha, \quad j = 1, \ldots, m,$$

$$H_1: \exists\, 1 \le \rho < m \text{ such that } \alpha^{(j)} = \alpha_1,\ j = 1, \ldots, \rho, \quad \alpha^{(j)} = \alpha_2,\ j = \rho + 1, \ldots, m,$$

where $\alpha_1 \ne \alpha_2$.

For the simple case of $\beta = 0$, this problem becomes a test for a single change in the
mean of normal random variables with constant variance. Many papers have investigated
this type of change-point problem, in particular Gardner (1969), Hinkley (1970), Hawkins
(1977), Siegmund (1986), and James, James, and Siegmund (1987). Now if there is a
covariate which has a constant effect on the $y_j$'s, the two-phase regression model introduced
above could describe the situation. This kind of two-phase regression model can be used to
describe the relationship between household consumption and the disposable income of the
household. Household consumption cannot be explained simply by the disposable income of
the household. Many other variables, such as the age, sex, race, and education of the family
head, may also affect the level of consumption expenditures of the household. For example,
consumption patterns may differ greatly according to the age of the family head. If a
sample survey of a household was made for a period including the year when the children
of the family began to live independently, then the data might be plotted as in Fig. 1.

[Fig. 1. Horizontal axis: disposable annual income (1000 $).]

This  testing  problem  was  first  studied  by  Maronna  and  Yohai  (1978).  They  studied 
the  LRT  and  also  discussed  some  applications  in  meteorology.  In  the  next  section  their 
approach  and  some  results  will  be  discussed. 

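In this section, we derive the LRS for cases of known and unknown $\sigma^2$ and study the
null and alternative distributions of the LRS. When $\sigma^2$ is known, $\sigma^2$ can be assumed to
be equal to 1 without loss of generality. Then the -2 log(likelihood ratio) statistic for testing
$H_0$ against the alternative that a change occurred at $i$ is proportional to

$$\left\{\frac{mi}{m-i}\right\}^{1/2} \frac{|Q_{xx,m}(\bar y_m - \bar y_i) - Q_{xy,m}(\bar x_m - \bar x_i)|}{\{Q_{xx,m}^2 - Q_{xx,m}(\bar x_m - \bar x_i)^2 mi/(m-i)\}^{1/2}}
= \frac{|\hat\alpha_m - \hat\alpha_i|}{[1 - (\bar x_m - \bar x_i)^2 mi/\{(m-i)Q_{xx,m}\}]^{1/2}\,\{(m-i)/(mi)\}^{1/2}} = |U_m(i/m)|,$$

where

$$\bar x_i = \Big(\sum_{j=1}^{i} x_j\Big)/i, \qquad \bar y_i = \Big(\sum_{j=1}^{i} y_j\Big)/i,$$

$$\bar x_i' = \Big(\sum_{j=i+1}^{m} x_j\Big)/(m-i), \qquad \bar y_i' = \Big(\sum_{j=i+1}^{m} y_j\Big)/(m-i),$$

$$Q_{xx,m} = \sum_{j=1}^{m} (x_j - \bar x_m)^2, \qquad Q_{xy,m} = \sum_{j=1}^{m} (x_j - \bar x_m)(y_j - \bar y_m),$$

$$\hat\beta = Q_{xy,m}/Q_{xx,m}, \qquad \hat\alpha_m = \bar y_m - \hat\beta \bar x_m, \qquad \hat\alpha_i = \bar y_i - \hat\beta \bar x_i.$$
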

Hence the likelihood ratio test (LRT) of $H_0$ against $H_1$ can be based on

$$\max_{1 \le i < m} |U_m(i/m)|.$$

Slightly more generally, we shall consider the test statistic

$$M_1 = \max_{m_0 \le i \le m_1} |U_m(i/m)|, \tag{2.1}$$

where $1 \le m_0 < m_1 < m$. We call $M_1$ a modified likelihood ratio statistic (MLRS)
and the test based on $M_1$ a modified likelihood ratio test (MLRT). The MLRT was introduced
by Siegmund (1986), who used the MLRS to test for a change in the mean of a sequence
of normal random variables. The introduction of $m_0$ and $m_1$ in (2.1) can be justified in
terms of the power of the test. Since it is intrinsically difficult to detect a change occurring
near either of the two end points, the LRS pays for its efforts to detect such a change
by having less power at other points. This will be discussed more completely in Section
2.4 with numerical results. Based on the MLRS, $H_0$ is rejected when $M_1$ is larger than
some constant. The value of $i$ which maximizes $|U_m(i/m)|$ is the maximum likelihood
estimate of the true change point.
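
As an illustration of (2.1), the sketch below computes $U_m(i/m)$ for $i = 1, \ldots, m-1$, the statistic $M_1$, and the maximizing index. It is not taken from the dissertation: the function names are hypothetical, the error variance is assumed known and equal to 1, and the sign convention $U_m(i/m) \propto \hat\alpha_m - \hat\alpha_i$ is an assumption (only $|U_m|$ enters the test).

```python
import math


def u_statistics(x, y):
    """Return [U_m(i/m) for i = 1..m-1], assuming unit error variance."""
    m = len(x)
    xbar_m = sum(x) / m
    ybar_m = sum(y) / m
    qxx = sum((xj - xbar_m) ** 2 for xj in x)
    qxy = sum((xj - xbar_m) * (yj - ybar_m) for xj, yj in zip(x, y))
    beta_hat = qxy / qxx
    out = []
    sx = sy = 0.0
    for i in range(1, m):
        sx += x[i - 1]
        sy += y[i - 1]
        dx = xbar_m - sx / i          # xbar_m - xbar_i
        dy = ybar_m - sy / i          # ybar_m - ybar_i
        # D_m(i/m, i/m); zero only if x is itself a step function at i
        d_ii = 1.0 - dx * dx * m * i / ((m - i) * qxx)
        scale = math.sqrt(m * i / (m - i))
        out.append(scale * (dy - beta_hat * dx) / math.sqrt(d_ii))
    return out


def mlrs(x, y, m0, m1):
    """M_1 = max over m0 <= i <= m1 of |U_m(i/m)|, with the maximizing i."""
    u = u_statistics(x, y)
    best_i = max(range(m0, m1 + 1), key=lambda i: abs(u[i - 1]))
    return abs(u[best_i - 1]), best_i


if __name__ == "__main__":
    x = [0.3, 1.1, 0.7, 2.0, 1.5, 0.2, 1.9, 0.8, 1.2, 0.6]
    # Noise-free data with an intercept jump of 3 after observation 4.
    y = [1 + 2 * xj + (3 if j > 4 else 0) for j, xj in enumerate(x, start=1)]
    print(mlrs(x, y, m0=2, m1=8))
```

For noise-free data with a single intercept jump, the maximizing index coincides with the true change point, and adding any linear function of $x$ to $y$ leaves every $U_m(i/m)$ unchanged.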

Even though the assumption of normal random errors with known variance simplifies
this problem, theoretical properties of $M_1$ are still difficult to characterize. Under $H_0$,
$\hat\alpha_m - \hat\alpha_i$ has a normal distribution with mean 0 and variance
$[1 - (\bar x_m - \bar x_i)^2 mi/\{(m-i)Q_{xx,m}\}]/\{mi/(m-i)\}$, and so $U_m(i/m)$ has a standard normal distribution for each $i$.
Hence the null distribution of $M_1$ is that of the maximum absolute value of a sequence of correlated
standard normal random variables. The covariance between $U_m(i/m)$ and $U_m(j/m)$ for
$i < j$ is given by

$$\mathrm{Cov}[U_m(i/m), U_m(j/m)] = \left\{\frac{(i/m)(1 - j/m)}{(j/m)(1 - i/m)}\right\}^{1/2} \frac{D_m(i/m, j/m)}{\{D_m(i/m, i/m)\, D_m(j/m, j/m)\}^{1/2}}, \tag{2.2}$$

where

$$D_m(i/m, j/m) = 1 - (\bar x_m - \bar x_i)(\bar x_m - \bar x_j)\, mj/[(m-j)Q_{xx,m}] \quad \text{for } i \le j.$$

The derivation of (2.2) will be given in the following section. The null distribution of $M_1$
depends on the $x_j$'s only through this covariance structure of $\{U_m(i/m)\}$, and not on $\alpha$ or $\beta$.
Under the alternative, $U_m(i/m)$ is normally distributed and $\mathrm{Cov}[U_m(i/m), U_m(j/m)]$
remains the same as under the null. But now $U_m(i/m)$ has a non-zero mean for all $i$, which is
given by

$$E[U_m(i/m)] = \begin{cases}
\dfrac{i(1 - \rho/m)\, D_m(i/m, \rho/m)}{\{i(1 - i/m)\, D_m(i/m, i/m)\}^{1/2}}\,(\alpha_2 - \alpha_1), & i \le \rho, \\[2ex]
\dfrac{\rho(1 - i/m)\, D_m(\rho/m, i/m)}{\{i(1 - i/m)\, D_m(i/m, i/m)\}^{1/2}}\,(\alpha_2 - \alpha_1), & i > \rho.
\end{cases} \tag{2.3}$$

So the alternative distribution of $M_1$ depends on the unknown parameter $\alpha_2 - \alpha_1$ and the
unknown change point $\rho$. One interesting property of the test statistic is that the nuisance
parameter $\rho$ is present only under the alternative. This property makes analysis difficult
since the standard chi-square approximation of -2 log(likelihood ratio) cannot be applied
in this case.
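
Formulas (2.2) and (2.3) can be checked numerically, because each $U_m(i/m)$ is a linear combination $\sum_k w_{ik} y_k$ of the observations. With unit error variance the covariance is the inner product of two weight vectors, and the mean is the inner product of a weight vector with the mean vector of the $y_k$. The sketch below is an illustrative check under stated assumptions, not the derivation given in the dissertation: the helper names are hypothetical, $U_m(i/m)$ is taken proportional to $(\bar y_m - \bar y_i) - \hat\beta(\bar x_m - \bar x_i)$, and $D_m$ is evaluated with the smaller index first.

```python
import math


def weights_and_d(x):
    """Return (w, d): w(i) gives the weights of U_m(i/m) on y_1..y_m,
    and d(i, j) evaluates D_m(i/m, j/m) with the smaller index first."""
    m = len(x)
    xbar = [sum(x[:i]) / i for i in range(1, m + 1)]  # xbar[i-1] = mean of first i
    xm = xbar[-1]
    qxx = sum((xk - xm) ** 2 for xk in x)

    def d(i, j):
        i, j = min(i, j), max(i, j)
        return 1.0 - (xm - xbar[i - 1]) * (xm - xbar[j - 1]) * m * j / ((m - j) * qxx)

    def w(i):
        c = math.sqrt(m * i / (m - i)) / math.sqrt(d(i, i))
        return [c * (1.0 / m - (1.0 / i if k <= i else 0.0)
                     - (xm - xbar[i - 1]) * (x[k - 1] - xm) / qxx)
                for k in range(1, m + 1)]

    return w, d


def cov_formula(i, j, m, d):
    """Right-hand side of (2.2) for i < j."""
    s, t = i / m, j / m
    return (math.sqrt(s * (1 - t) / (t * (1 - s))) * d(i, j)
            / math.sqrt(d(i, i) * d(j, j)))


def mean_formula(i, rho, m, d, delta):
    """Right-hand side of (2.3) with delta = alpha_2 - alpha_1."""
    lead = i * (1 - rho / m) if i <= rho else rho * (1 - i / m)
    return lead * d(i, rho) * delta / math.sqrt(i * (1 - i / m) * d(i, i))


if __name__ == "__main__":
    x = [0.3, 1.1, 0.7, 2.0, 1.5, 0.2, 1.9, 0.8, 1.2, 0.6]
    w, d = weights_and_d(x)
    direct = sum(a * b for a, b in zip(w(3), w(6)))
    print(direct, cov_formula(3, 6, len(x), d))
```

The weights sum to zero and are orthogonal to $x$, which is why neither $\alpha$ nor $\beta$ enters the null distribution.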


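If $\sigma^2$ is unknown, the LRS is proportional to

$$\max_{1 \le i < m} \left\{\frac{m^2 i}{m-i}\right\}^{1/2} \frac{|Q_{xx,m}(\bar y_m - \bar y_i) - Q_{xy,m}(\bar x_m - \bar x_i)|}{\{(Q_{xx,m}^2 - Q_{xx,m}(\bar x_m - \bar x_i)^2 mi/(m-i))(Q_{yy,m} - Q_{xy,m}^2/Q_{xx,m})\}^{1/2}}
= \max_{1 \le i < m} |U_m(i/m)|/\hat\sigma,$$

where $\hat\sigma^2 = (Q_{yy,m} - Q_{xy,m}^2/Q_{xx,m})/m$.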
Again we consider the generalization

$$M_2 = \max_{m_0 \le i \le m_1} |U_m(i/m)|/\hat\sigma = \max_{m_0 \le i \le m_1} |\tilde U_m(i/m)|,$$

where $\tilde U_m(i/m) = U_m(i/m)/\hat\sigma$.

Under $H_0$, $U_m(i/m)$ has a standard normal distribution and $\hat\sigma^2 m$ has a chi-square
distribution with $m - 2$ degrees of freedom. Since $U_m(i/m)$ and $\hat\sigma$ are not independent in general
and the distribution of $\tilde U_m(i/m)$ depends on the $x_j$'s through the complicated covariance
structure, it is difficult to find the exact distribution of $M_2$.
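
A direct transcription of the unknown-variance statistic into code may help to see what it depends on. The sketch below is illustrative only: the function name is hypothetical, and $Q_{yy,m} = \sum_j (y_j - \bar y_m)^2$ is assumed by analogy with $Q_{xx,m}$. It evaluates the displayed ratio over $m_0 \le i \le m_1$.

```python
import math


def mlrs_unknown_variance(x, y, m0, m1):
    """M_2: maximum over m0 <= i <= m1 of |U_m(i/m)| / sigma_hat."""
    m = len(x)
    xbar_m, ybar_m = sum(x) / m, sum(y) / m
    qxx = sum((a - xbar_m) ** 2 for a in x)
    qyy = sum((b - ybar_m) ** 2 for b in y)
    qxy = sum((a - xbar_m) * (b - ybar_m) for a, b in zip(x, y))
    resid = qyy - qxy ** 2 / qxx  # equals m * sigma_hat^2
    best = 0.0
    for i in range(m0, m1 + 1):
        dx = xbar_m - sum(x[:i]) / i
        dy = ybar_m - sum(y[:i]) / i
        num = abs(qxx * dy - qxy * dx)
        den = math.sqrt((qxx ** 2 - qxx * dx ** 2 * m * i / (m - i)) * resid)
        best = max(best, math.sqrt(m * m * i / (m - i)) * num / den)
    return best


if __name__ == "__main__":
    x = [0.3, 1.1, 0.7, 2.0, 1.5, 0.2, 1.9, 0.8, 1.2, 0.6]
    y = [1.2, 0.4, 2.1, 1.7, 0.9, 3.3, 2.8, 3.9, 2.5, 3.1]
    print(mlrs_unknown_variance(x, y, m0=2, m1=8))
```

Rescaling $y$, or adding any linear function of $x$ to it, leaves the value unchanged, so $M_2$ is free of $\alpha$, $\beta$ and $\sigma$ under the null hypothesis.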

The dependence of test statistics on the values of the independent variable is one of
the difficulties that must be handled, as well as the maximization involved in the definition of
the  test  statistics.  By  simulation,  Beckman  and  Cook  (1979)  pointed  out  that  the  influence 
of  the  configuration  of  the  values  of  the  independent  variable  is  non-negligible  and  the 
percentiles  of  the  test  statistic  increase  as  the  variance  of  the  configuration  increases. 
In  the  following  sections,  we  will  study  the  asymptotic  behavior  of  the  MLRS,  especially 
behavior  of  the  significance  level,  and  will  discuss  the  effect  of  the  spacings  of  the  values 
of  the  independent  variable. 

2.2.  Asymptotic  Behavior  of  Test  Statistics 

In this section, we study asymptotic properties of test statistics when the independent
variable is random as well as fixed. The regression model which involves a random
independent variable was introduced by Maronna and Yohai (1978). This model is
appropriate when the dependent variable may undergo a systematic change at some unknown
point, while the independent variable does not change and affects the dependent variable
through the correlation between the independent and dependent variable. Maronna and
Yohai gave an example of such a situation in meteorology, as follows. Let x and y be two
nearby meteorological stations. The measurements might be mean annual precipitations
and it might be desired to test the hypothesis that the only fluctuations are those due to
the intrinsic randomness of the magnitude being measured, against the alternative that
a systematic change has occurred at one of the stations after some point, due to
unregistered changes in the measurement apparatus or the location of the station.

In Section 2.2.1, we study the case in which the independent variable is random. Also,
the asymptotic behavior of the MLRS is considered conditionally on the $x_j$'s. Section
2.2.2 deals with the case of fixed values of the independent variable. Starting from the
special case where the values of the independent variable are uniformly spaced, as they
would be if the independent variable is time and observations are made at equal intervals
of time, we study the limiting behavior of the MLRS under a mild assumption about the
empirical distribution of the independent variable. In Section 2.1, the LRS was derived
assuming that the $\varepsilon_j$'s are identically and normally distributed. The asymptotic results to
be discussed in Sections 2.2 and 2.3 hold even in the case of a general error distribution.

2.2.1.  When  the  independent  variable  is  random 

Maronna and Yohai (1978) considered the case in which both the independent and
dependent variables are random and they studied the limiting distribution under the null
hypothesis. Since the LRS does not depend on the slope under the null hypothesis, the
independent variable can be taken to be independent of the dependent variable. They gave
the percentiles of the LRS when $(x, y)$ has a bivariate normal distribution with 0 mean
vector and identity covariance matrix, obtained by the Monte Carlo method. Their main
result is about the limiting distribution of the test statistic, which will be stated in the
following theorem. They showed that the LRS tends to $\infty$ as $m \to \infty$. Here,
we consider the MLRS and show the convergence of the MLRS in distribution. Basically
this theorem was proved by Maronna and Yohai, but their proof is not complete in some
of the details concerned with the convergence of the stochastic process. In our proof, we
consider "convergence in distribution" in the space $C = C[0, 1]$ of continuous functions
on $[0, 1]$, equipped with a $\sigma$-field $\mathcal{C}$ and the uniform metric.

Notation. Let $W(i/m)$ be a discrete time stochastic process defined at $i = 1, \ldots, m$.
Then $W_c$ denotes a process which is continuous in $[0, 1]$, equals $W$ at $i/m$ ($i = 1, \ldots, m$)
and is linear in each interval $(i/m, (i+1)/m)$.

Lemma 2.2.1.

Let $\{v_j\}$ be a sequence of i.i.d. random variables such that $E[v_j^2] = 1$. Define
$W_m^0(i/m) = (\bar v_i - \bar v_m)\, i/\sqrt{m}$, where $\bar v_i = \sum_{j=1}^{i} v_j / i$. Then as $m \to \infty$,

$$W_{m,c}^0 \to W^0 \quad \text{in distribution}, \tag{2.4}$$

where $W^0$ is a Brownian bridge process.

Proof: Define $W_m(i/m) = \bar v_i\, i/\sqrt{m}$ and $W_{m,c}(t)$ to be the continuous process constructed
by linear interpolation. By Donsker's Theorem, $W_{m,c} \to W$ in distribution, where $W$
is a standard Brownian motion. Since the mapping $H$ such that $H(W_{m,c}) = W_{m,c}^0$ and
$H(W) = W^0$ is continuous, (2.4) holds by the continuous mapping theorem of weak
convergence. |
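
The discrete process in Lemma 2.2.1 is easy to simulate. The sketch below is an illustration, not part of the proof: the function names are hypothetical and the $v_j$ are assumed to be standard normal, so that the mean-zero condition needed for Donsker's Theorem holds. It builds one path of $W_m^0$ and estimates its variance at a given $t$. For unit-variance $v_j$ the exact variance of $W_m^0(i/m)$ is $(i/m)(1 - i/m)$, which matches the Brownian bridge variance $t(1-t)$.

```python
import math
import random


def discrete_bridge(v):
    """W0_m(i/m) = (vbar_i - vbar_m) * i / sqrt(m) for i = 1..m."""
    m = len(v)
    vbar_m = sum(v) / m
    path = []
    partial = 0.0
    for i in range(1, m + 1):
        partial += v[i - 1]
        path.append((partial / i - vbar_m) * i / math.sqrt(m))
    return path


def variance_at(t, m, n_rep, seed=0):
    """Monte Carlo estimate of Var[W0_m(i/m)] at i = round(t * m)."""
    rng = random.Random(seed)
    i = round(t * m)
    values = []
    for _ in range(n_rep):
        v = [rng.gauss(0.0, 1.0) for _ in range(m)]
        values.append(discrete_bridge(v)[i - 1])
    # The process has mean zero, so the second moment estimates the variance.
    return sum(val * val for val in values) / n_rep


if __name__ == "__main__":
    print(variance_at(0.5, m=50, n_rep=4000, seed=1))  # exact value is 0.25
```

The path is tied down at $t = 1$ by construction, since $\bar v_i = \bar v_m$ when $i = m$.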


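Theorem 2.2.2.

Let $(x_1, y_1), \ldots, (x_m, y_m)$ be i.i.d. random variables such that $E[x_1^2] < \infty$ and
$E[y_1^2] < \infty$.

Under $H_0$, as $m \to \infty$ and $i/m \to t$,

$$U_m(i/m) = \frac{U_m^0(i/m)}{\{(i/m)(1 - i/m)\}^{1/2}} \to \frac{W^0(t)}{\{t(1-t)\}^{1/2}} \quad \text{in distribution},$$

where $W^0$ is a Brownian bridge process.

And so, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$,

$$M_1 = \max_{m_0 \le i \le m_1} \frac{|U_m^0(i/m)|}{\{(i/m)(1 - i/m)\}^{1/2}} \to \max_{t_0 \le t \le t_1} \frac{|W^0(t)|}{\{t(1-t)\}^{1/2}} \quad \text{in distribution}. \tag{2.5}$$

Proof:

(i) Note that $U_m^0(i/m)$ can be rewritten as

$$[B_y(i/m) - B_x(i/m)\, Q_{xy,m}/Q_{xx,m}]/\{D_m(i/m, i/m)\}^{1/2},$$
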
where

$$B_y(i/m) = (\bar y_m - \bar y_i)\, i/\sqrt{m},$$

$$B_x(i/m) = (\bar x_m - \bar x_i)\, i/\sqrt{m},$$

$$D_m(i/m, i/m) = 1 - [B_x(i/m)]^2/\{Q_{xx,m}(i/m)(1 - i/m)\}.$$

(ii) It may be assumed without loss of generality that $E[x_1] = E[y_1] = 0$, $E[x_1^2] =
E[y_1^2] = 1$ and $\beta = 0$.

Then the Law of Large Numbers implies that

$$Q_{xx,m}/m \to 1, \quad Q_{xy,m}/m \to 0 \quad \text{in probability}.$$

By the previous lemma,

$$B_{x,c} \to W_x^0, \quad B_{y,c} \to W_y^0 \quad \text{in distribution},$$

and hence

$$D_m \to 1 \quad \text{in probability},$$

where $W_x^0$ and $W_y^0$ are two independent Brownian bridges.

Then the continuous mapping theorem implies that (2.5) holds as $m \to \infty$. |
Corollary 2.2.3.

Under the same assumption as in Theorem 2.2.2,

$$M_2 = \max_{m_0 \le i \le m_1} \frac{|U_m^0(i/m)|}{\hat\sigma\,\{(i/m)(1 - i/m)\}^{1/2}} \to \max_{t_0 \le t \le t_1} \frac{|W^0(t)|}{\{t(1-t)\}^{1/2}} \quad \text{in distribution}.$$

Proof: Since $\hat\sigma^2$ is a consistent estimator of $\sigma^2$ and $M_2 = M_1/\hat\sigma$, $M_2$ converges to the
same limit as in (2.5) by Slutsky's Lemma. |


Now we will consider the conditional test for $H_0$. This conditional test is based on the same test statistics, $M_1$ or $M_2$, but the rejection threshold depends on the $x_j$'s, which are ancillary. In the following theorem, the asymptotic behavior of the MLRS will be considered conditionally on the $x_j$'s when the $x_j$'s are a random sample from some distribution.

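Theorem 2.2.4.

Let $v_j' = (x_j, y_j)$, $j = 1, \dots, m$, be a sequence of i.i.d. random vectors such that

$$E[v_j] = \mu \quad \text{and} \quad E[v_j v_j'] = \Sigma.$$

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$, conditionally given $x_1, x_2, \dots$,

$$M_1 = \max_{m_0 \le i \le m_1} \frac{|U_m^0(i/m)|}{\{(i/m)(1 - i/m)\}^{1/2}} \to \max_{t_0 \le t \le t_1} \frac{|W^0(t)|}{\{t(1-t)\}^{1/2}} \quad \text{in distribution,}$$

with probability 1.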

Proof : This theorem is proved by basically the same argument as in the proof of Theorem 2.2.2. Note that

$$U_m(i/m) = \frac{Z_m(i/m)}{\{(i/m)(1 - i/m)\}^{1/2}} - \frac{R_m(i/m)}{\{(i/m)(1 - i/m)\}^{1/2}},$$

where

$$S_i = \sum_{j=1}^{i} e_j, \quad B_x(i/m) = (\bar{x}_m - \bar{x}_i)\, i/\sqrt{m}, \quad \hat\beta = Q_{xy,m}/Q_{xx,m},$$

$$Z_m(i/m) = [S_i - S_m\, i/m] / \sqrt{m\, D_m(i/m, i/m)},$$

$$R_m(i/m) = (\hat\beta - \beta)\, B_x(i/m) / \{D_m(i/m, i/m)\}^{1/2},$$

$$D_m(i/m, i/m) = 1 - [B_x(i/m)]^2 / \{Q_{xx,m}\,(i/m)(1 - i/m)\}.$$

Then a.e. in $x$, as $m \to \infty$,

(i) $Z_m^c \to W^0$ in distribution,

(ii) $\max_{m_0 \le i \le m_1} R_m(i/m) \to 0$ in probability,

(iii) For any positive $\varepsilon$,

$$\Pr\left\{ \max_{t_0 \le t \le t_1} \frac{|Z_m^c(t)|}{\{t(1-t)\}^{1/2}} - \max_{m_0 \le i \le m_1} \frac{|Z_m^c(i/m)|}{\{(i/m)(1 - i/m)\}^{1/2}} > \varepsilon \right\} < \varepsilon.$$

Combining these results, the proof is completed. |


In proving Theorem 2.2.4, the necessary properties of the $x_j$'s are

$$\Big(\sum_{j=1}^{m} x_j\Big)/m \to a \quad \text{and} \quad \Big(\sum_{j=1}^{m} x_j^2\Big)/m \to b \quad \text{a.e.}$$

In particular,

$$\bar{x}_m - \bar{x}_i \to 0 \quad \text{as } m \to \infty \text{ and } i \to \infty.$$

By Theorems 2.2.2 and 2.2.4, it can be said that if the values of the independent variable are from some distribution, then the test statistic converges to the same limiting distribution whether we consider the test as the conditional or the unconditional one.

2.2.2.  When  the  independent  variable  is  fixed 

In the previous section, we considered a case where $(x_j, y_j)$ has a bivariate distribution such that $E[x_j^2] < \infty$ and $E[y_j^2] < \infty$. As a conditional test, we needed the convergence of the first and second moments of the independent variable to get the above limiting distribution. However, the independent variable is fixed in most applications and does not satisfy this condition in general. This section deals with asymptotic properties of the MLRS when the independent variable is fixed. We begin with the case in which the values of the independent variable are uniformly spaced. If $x_j = j/m$ for $j = 1, \dots, m$, then $\bar{x}_m - \bar{x}_i = (1 - i/m)/2 \not\to 0$, and so the limiting distribution is not the same as in the previous section. First, we shall assume that $\sigma^2$ is known and hence without loss of generality equals one. Under the null hypothesis, we can write $U_m(i/m)$ as a weighted sum of the $e_k$'s to prove Theorem 2.2.6 :

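$$U_m(i/m) = \sum_{k=1}^{m} a_{i,k}\, e_k, \tag{2.6}$$

where

$$a_{i,k} = \begin{cases} \left[ -\dfrac{m - i}{mi} + \dfrac{(\bar{x}_i - \bar{x}_m)(x_k - \bar{x}_m)}{Q_{xx,m}} \right] \Big/ \left\{ \dfrac{(m - i)\, D_m(i/m, i/m)}{mi} \right\}^{1/2}, & k \le i, \\[2ex] \left[ \dfrac{1}{m} + \dfrac{(\bar{x}_i - \bar{x}_m)(x_k - \bar{x}_m)}{Q_{xx,m}} \right] \Big/ \left\{ \dfrac{(m - i)\, D_m(i/m, i/m)}{mi} \right\}^{1/2}, & k > i, \end{cases}$$

$$D_m(i/m, j/m) = 1 - (\bar{x}_m - \bar{x}_i)(\bar{x}_m - \bar{x}_j)\, mj / \{Q_{xx,m}(m - j)\} \quad \text{for } j \ge i.$$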

Lemma 2.2.5.

Let $n \ge 2$ and $\{X_m = (X_{1,m}, \dots, X_{n,m})\}$ be a sequence of random elements of $\times_{k=1}^{n} C_k$ (where $C_k = C[0,1]$) equipped with a product $\sigma$-field.

The sequence $\{X_m\}$ is tight if and only if the $n$ sets of marginal distributions, $\{X_{1,m}\}, \dots, \{X_{n,m}\}$, are tight in $C_1, \dots, C_n$.

Proof : Suppose that the sequence $\{X_m = (X_{1,m}, \dots, X_{n,m})\}$ is tight. Then there exists a compact set $K$ in $\times_{k=1}^{n} B_k$ such that

$$\Pr\{X_j \in K\} > 1 - \varepsilon \quad \text{for all } X_j \in \{X_m\}.$$

Let $h_i$ be the mapping that carries the point $p = (p_1, \dots, p_n)$ in $\times_{k=1}^{n} B_k$ to $p_i$ in $B_i$ for $i = 1, \dots, n$. Since $h_i$ is continuous for all $i$, $K' = h_i K$ is compact and so $h_i^{-1} K' \supset K$. Then

$$\Pr\{X_{i,j} \in K'\} = \Pr\{X_j \in h_i^{-1} K'\} \ge \Pr\{X_j \in K\} > 1 - \varepsilon,$$

which implies that the sequences $\{X_{1,m}\}, \dots, \{X_{n,m}\}$ are tight.

Conversely, suppose that $\{X_{1,m}\}, \dots, \{X_{n,m}\}$ are tight sequences of random elements. Choose a positive $\varepsilon$. Then for each $i$, there exists a compact set $K_i$ in $B_i$ such that

$$\Pr\{X_{i,j} \in K_i\} > 1 - \varepsilon/n \quad \text{for all } X_{i,j} \in \{X_{i,m}\}.$$

Let $K = \bigcap_{i=1}^{n} h_i^{-1} K_i$. Then $K$ is compact and

$$\Pr\{X_j \in K\} \ge 1 - \sum_{i=1}^{n} \Pr\{X_{i,j} \notin K_i\} > 1 - \varepsilon \quad \text{for all } X_j \in \{X_m\}.$$

Hence $\{X_m\}$ is a tight sequence of random elements. |


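Theorem 2.2.6.

Suppose that $x_j = j/m$ for $j = 1, \dots, m$.

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$,

$$M_1 = \max_{m_0 \le i \le m_1} |U_m(i/m)| \to \max_{t_0 \le t \le t_1} |U(t)| \quad \text{in distribution,} \tag{2.7}$$

where $U$ is a Gaussian process with mean 0 and a covariance function,

$$\mathrm{Cov}[U(t), U(s)] = \left\{ \frac{t(1-s)}{s(1-t)} \right\}^{1/2} \frac{D(s,t)}{\{D(t,t)\,D(s,s)\}^{1/2}} = \sigma(t,s),$$

where $D(s,t) = 1 - 3s(1-t)$ for $t < s$.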
Proof : Recall that

$$U_m(i/m) = \frac{Z_m(i/m)}{\{(i/m)(1 - i/m)\}^{1/2}} - \frac{R_m(i/m)}{\{(i/m)(1 - i/m)\}^{1/2}},$$

where $Z_m$ and $R_m$ are defined in the proof of Theorem 2.2.4. By Theorem A 1.1, to show that $Z_m^c - R_m^c$ converges in distribution, we have only to check that the finite dimensional distributions converge in distribution and the sequence is tight. It can be easily shown that $\{Z_m^c\}$ and $\{R_m^c\}$ are tight. To prove that the finite dimensional distributions of $Z_m^c - R_m^c$ converge to those of $U^0$, where $U^0$ is a Gaussian process with mean vector 0 and covariance matrix $A$ to be defined later, we will show that for any sets of $(r_1, \dots, r_n)$ and $(i_1, \dots, i_n)$ such that $(i_1/m, \dots, i_n/m) \to (t_1, \dots, t_n)$ as $m \to \infty$,

$$E\Big[\exp\Big\{\mathrm{i} \sum_{k=1}^{n} r_k \big(Z_m^c(i_k/m) - R_m^c(i_k/m)\big)\Big\}\Big] \to E\Big[\exp\Big\{\mathrm{i} \sum_{k=1}^{n} r_k U^0(t_k)\Big\}\Big] \quad \text{as } m \to \infty,$$

where $\mathrm{i}$ inside the exponential denotes the imaginary unit.

By using (2.6),

$$E\Big[\exp\Big\{\mathrm{i} \sum_{k=1}^{n} r_k \big(Z_m^c(i_k/m) - R_m^c(i_k/m)\big)\Big\}\Big] = E\Big[\exp\Big\{\mathrm{i} \sum_{k=1}^{n} \sum_{j=1}^{m} r_k w_{k,j}\, e_j\Big\}\Big] = E[\exp \mathrm{i}(b'e)],$$

where $w_{k,j}$ denotes the weight of $e_j$ in the representation (2.6) of $Z_m^c(i_k/m) - R_m^c(i_k/m)$, $b = (b_1, \dots, b_m)$ with $b_j = \sum_{k=1}^{n} r_k w_{k,j}$, and $e = (e_1, \dots, e_m)$. Now elementary algebra shows that

$$\sum_{j=1}^{m} b_j^2 \to r'Ar,$$

where $r' = (r_1, \dots, r_n)$, and $A$ is a matrix whose $(k,l)$-th entry is $\sigma(t_k, t_l)\{t_k(1 - t_k)\, t_l(1 - t_l)\}^{1/2}$. Hence this implies that

$$\big(Z_m^c(i_1/m) - R_m^c(i_1/m), \dots, Z_m^c(i_n/m) - R_m^c(i_n/m)\big) \to \big(U^0(t_1), \dots, U^0(t_n)\big) \quad \text{in distribution.}$$

Now tightness of the sequence $\{Z_m^c - R_m^c\}$ follows from Lemma 2.2.5 and Lemma 7 (Billingsley, 196*). Thus $Z_m^c - R_m^c$ converges to $U^0$ in distribution.

By the continuous mapping theorem,

$$M_1 \to \max_{t_0 \le t \le t_1} \frac{|U^0(t)|}{\{t(1-t)\}^{1/2}} \quad \text{in distribution.}$$

It is evident that $U^0(t)/\{t(1-t)\}^{1/2}$ is a Gaussian process. Since a Gaussian process is completely determined by its mean vector and covariance matrix, (2.7) holds. |
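As a numerical illustration of Theorem 2.2.6 (not part of the original derivation), the following Python sketch compares the exact finite-sample covariance of $U_m(i/m)$ and $U_m(j/m)$ for $x_k = k/m$, built from the expression for $D_m$ given with (2.6), with the limiting covariance $\sigma(t,s)$. The function names `exact_cov` and `limit_cov` are hypothetical, and unit error variance is assumed.

```python
import math


def exact_cov(i, j, m):
    """Exact Cov[U_m(i/m), U_m(j/m)] for x_k = k/m and i <= j (unit error variance)."""
    x = [k / m for k in range(1, m + 1)]
    xbar_m = sum(x) / m
    q_xx = sum((xk - xbar_m) ** 2 for xk in x)

    def xbar(n):
        # Mean of the first n design points.
        return sum(x[:n]) / n

    def d_m(a, b):
        # D_m(a/m, b/m) for a <= b, as defined alongside (2.6).
        return 1.0 - (xbar_m - xbar(a)) * (xbar_m - xbar(b)) * m * b / (q_xx * (m - b))

    lead = math.sqrt(i * (m - j) / (j * (m - i)))
    return lead * d_m(i, j) / math.sqrt(d_m(i, i) * d_m(j, j))


def limit_cov(t, s):
    """Limiting covariance sigma(t, s) for t <= s, with D(t, s) = 1 - 3 s (1 - t)."""
    def d(u, v):  # requires u <= v
        return 1.0 - 3.0 * v * (1.0 - u)

    return math.sqrt(t * (1 - s) / (s * (1 - t))) * d(t, s) / math.sqrt(d(t, t) * d(s, s))


m = 2000
gap = abs(exact_cov(600, 1200, m) - limit_cov(0.3, 0.6))
print(gap)
```

Under these assumptions the gap should shrink as $m$ grows, which is the behaviour the finite dimensional convergence argument relies on.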

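Now we will generalize this result to the case where $x_j = f(j/m)$ for some integrable function $f$.

Lemma 2.2.7.

Suppose that $x_j = f(j/m)$, $j = 1, \dots, m$, for some integrable function $f$ such that $f(0) = 0$ and $f(1) = 1$.

Then for $i < j$, as $i/m \to t$, $j/m \to s$, $m \to \infty$,

$$\mathrm{Cov}[U_m(i/m), U_m(j/m)] = \left\{ \frac{(i/m)(1 - j/m)}{(j/m)(1 - i/m)} \right\}^{1/2} \frac{D_m(i/m, j/m)}{\{D_m(i/m, i/m)\, D_m(j/m, j/m)\}^{1/2}},$$

where

$$g_m(i/m) = (\bar{x}_m - \bar{x}_i) / \{(1 - i/m)\sqrt{Q_{xx}/m}\},$$

$$D_m(i/m, j/m) = 1 - (\bar{x}_m - \bar{x}_i)(\bar{x}_m - \bar{x}_j)\, mj / \{Q_{xx}(m - j)\} = 1 - (j/m)(1 - i/m)\, g_m(i/m)\, g_m(j/m) \quad \text{for } j > i,$$

converges to

$$\left\{ \frac{t(1-s)}{s(1-t)} \right\}^{1/2} \frac{D(t,s)}{\{D(t,t)\, D(s,s)\}^{1/2}}, \tag{2.8}$$

where

$$g(t) = \frac{\int_0^1 f(u)\,du - \{\int_0^t f(u)\,du\}/t}{(1 - t)\big\{\int_0^1 f^2(u)\,du - [\int_0^1 f(u)\,du]^2\big\}^{1/2}},$$

$$D(t,s) = 1 - s(1 - t)\, g(t)\, g(s) \quad \text{for } s > t.$$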

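Proof : First we will derive $\mathrm{Cov}[U_m(i/m), U_m(j/m)]$ for $i < j$. Note that $U_m(i/m)$ can be written as

$$[(\bar{y}_m - \bar{y}_i) - (\bar{x}_m - \bar{x}_i)\, Q_{xy,m}/Q_{xx,m}] / \{D_m(i/m, i/m)(m - i)/(mi)\}^{1/2}.$$

Assuming the $e_k$'s are normally distributed with mean 0 and variance 1, it can be easily shown that

$$\begin{pmatrix} \bar{y}_m - \bar{y}_i \\ (\bar{x}_m - \bar{x}_i)\hat\beta \end{pmatrix} \sim N\left( \begin{pmatrix} \beta(\bar{x}_m - \bar{x}_i) \\ \beta(\bar{x}_m - \bar{x}_i) \end{pmatrix}, \begin{pmatrix} (m - i)/(mi) & (\bar{x}_m - \bar{x}_i)^2/Q_{xx,m} \\ (\bar{x}_m - \bar{x}_i)^2/Q_{xx,m} & (\bar{x}_m - \bar{x}_i)^2/Q_{xx,m} \end{pmatrix} \right),$$

where $\hat\beta = Q_{xy,m}/Q_{xx,m}$.

Also it is straightforward to show that, for $i < j$,

$$\mathrm{Cov}[\bar{y}_m - \bar{y}_i,\ \bar{y}_m - \bar{y}_j] = \frac{m - j}{mj},$$

and hence

$$\mathrm{Cov}[U_m(i/m), U_m(j/m)] = \left\{ \frac{(i/m)(1 - j/m)}{(j/m)(1 - i/m)} \right\}^{1/2} \frac{D_m(i/m, j/m)}{\{D_m(i/m, i/m)\, D_m(j/m, j/m)\}^{1/2}}.$$

Since

$$\bar{x}_m - \bar{x}_i = \sum_{k=1}^{m} f(k/m)/m - \sum_{k=1}^{i} f(k/m)/i,$$

$$Q_{xx,m} = \sum_{k=1}^{m} f^2(k/m) - \Big[\sum_{k=1}^{m} f(k/m)\Big]^2 / m,$$

it can be easily shown that $g_m(i/m) \to g(t)$ as $m \to \infty$, $i \to \infty$. Then (2.8) follows immediately. |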

Lemma 2.2.7 says that the test statistic depends on the $x_j$'s only through the function $g_m$. When $x_j = j/m$, $g_m(j/m) = \sqrt{3}$ for all $j$. The same argument as in the proof of Theorem 2.2.6 leads to the following theorem.
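To make the role of $g$ concrete, the sketch below evaluates the limiting function $g(t)$ of Lemma 2.2.7 by numerical integration for two choices of $f$. It is an illustration only: the helper names `integrate`, `make_g` and `d_limit` are hypothetical, the midpoint rule is an arbitrary choice, and the closed form $g(t) = (\sqrt{5}/2)(1 + t)$ for $f(u) = u^2$ is derived here from the definition of $g$ rather than taken from the text.

```python
import math


def integrate(func, lo, hi, steps=20000):
    """Composite midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (k + 0.5) * h) for k in range(steps))


def make_g(f):
    """Return the limiting function g(t) of Lemma 2.2.7 for a design function f."""
    mean_f = integrate(f, 0.0, 1.0)
    var_f = integrate(lambda u: f(u) ** 2, 0.0, 1.0) - mean_f ** 2

    def g(t):
        partial_mean = integrate(f, 0.0, t) / t
        return (mean_f - partial_mean) / ((1.0 - t) * math.sqrt(var_f))

    return g


def d_limit(t, s, g):
    """D(t, s) = 1 - s (1 - t) g(t) g(s) for s >= t."""
    return 1.0 - s * (1.0 - t) * g(t) * g(s)


g_linear = make_g(lambda u: u)      # uniformly spaced design, f(u) = u
g_square = make_g(lambda u: u * u)  # quadratic design, f(u) = u^2

print(g_linear(0.3), math.sqrt(3.0))
print(g_square(0.3), math.sqrt(5.0) / 2.0 * 1.3)
```

For $f(u) = u$ the function is constant, so $D(t,s)$ reduces to $1 - 3s(1-t)$ as in Theorem 2.2.6; for other designs $g$ varies with $t$ and the covariance of the limiting process changes accordingly.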

Theorem 2.2.8.

Suppose that $x_j = f(j/m)$, $j = 1, \dots, m$, for some integrable function $f$ such that $f(0) = 0$ and $f(1) = 1$.

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$,

$$M_1 = \max_{m_0 \le i \le m_1} |U_m(i/m)| \to \max_{t_0 \le t \le t_1} |U(t)| \quad \text{in distribution,} \tag{2.9}$$

where $U$ is a Gaussian process with mean 0 and a covariance function

$$\mathrm{Cov}[U(t), U(s)] = \left\{ \frac{t(1-s)}{s(1-t)} \right\}^{1/2} \frac{D(s,t)}{\{D(t,t)\, D(s,s)\}^{1/2}} = \sigma(t,s),$$

where $D(s,t) = 1 - s(1 - t)\, g(t)\, g(s)$ for $t < s$.

Corollary 2.2.9.

Under the same assumption as in Theorem 2.2.8,

$$M_2 = \max_{m_0 \le i \le m_1} |U_m(i/m)| / \hat\sigma \to \max_{t_0 \le t \le t_1} |U(t)| \quad \text{in distribution.}$$

Remark 2.1. Comparing with the case in which the independent variable is random, we see that $D(t,s)/\{D(t,t)\,D(s,s)\}^{1/2}$ is an additional factor in the covariance of the limiting process. Since the mean and variance remain the same, the configuration of the values of the independent variable affects the distribution of the MLRS only through this additional term in the covariance function.

Remark 2.2. Since the limiting distribution of the MLRS in (2.5) involves Brownian motion, the stochastic process in that limit has non-differentiable sample paths. In the case of the fixed independent variable, $U$ in (2.7) also has non-differentiable sample paths.

2.3.  Approximations  to  Significance  Levels 

As described in Section 2.1, the exact distributions of the test statistics are quite difficult to analyse. In this section, we give approximations to the right-hand tail of the null distributions of $M_1$ and $M_2$ and perform Monte Carlo experiments to see how accurate these approximations are. As in Section 2.2, we first consider the case in which the $x_j$'s are random and then the case where the $x_j$'s are fixed. In both cases, we study the significance levels as boundary crossing probabilities by discrete stochastic processes with nondifferentiable sample paths. Approximations to significance levels are derived for the MLRS with known variance, $M_1$, and we will discuss how well these can be applied to the unknown variance case.

2.3.1.  When  the  independent  variable  is  random 

When the independent variable is random, Theorem 2.2.2 shows that the MLRS, $M_1$, converges to

$$\max_{t_0 \le t \le t_1} \frac{|W^0(t)|}{\{t(1-t)\}^{1/2}} \quad \text{in distribution,}$$

where $W^0$ is a Brownian bridge process on $[0,1]$ and $m_i/m \to t_i$, for $i = 0, 1$, as $m \to \infty$. Thus the significance level of the test, $\Pr\{M_1 > b\}$, can be approximated by that of this limiting distribution. Siegmund (1986) provides the approximation to

$$\Pr\Big\{ \max_{t_0 \le t \le t_1} |W^0(t)| / \{t(1-t)\}^{1/2} > b \Big\},$$

which is quite general since it can be applied also if the underlying distribution of the observations is not normal. However, Table 1 shows that this approximation overestimates the actual values by about 200%. In Table 1 and in other tables, we obtained the percentiles by a Monte Carlo experiment using simple random sampling with 10,000 samples for each situation. As a correction for discrete time, (19) in James, James, and Siegmund (1987) was used and that result is also summarized in Table 1. This discrete approximation does not perform perfectly but it gives a rough idea about the significance level.
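The Monte Carlo step can be sketched as follows. This is a minimal illustration rather than the program used for the tables: it assumes known unit error variance, draws the $x_j$'s and $y_j$'s as independent standard normal variables under $H_0$, computes $U_m(i/m)$ from the expression given in the proof of Lemma 2.2.7, and uses far fewer replications than the 10,000 mentioned above. The names `u_stats` and `mc_percentile` are hypothetical.

```python
import math
import random


def u_stats(x, y, m0, m1):
    """U_m(i/m) for i = m0, ..., m1 (requires m1 < m), unit error variance."""
    m = len(x)
    xbar_m = sum(x) / m
    ybar_m = sum(y) / m
    q_xx = sum((a - xbar_m) ** 2 for a in x)
    q_xy = sum((a - xbar_m) * (c - ybar_m) for a, c in zip(x, y))
    beta_hat = q_xy / q_xx
    stats, sx, sy = [], 0.0, 0.0
    for i in range(1, m1 + 1):
        sx += x[i - 1]
        sy += y[i - 1]
        if i < m0:
            continue
        dx = xbar_m - sx / i  # xbar_m - xbar_i
        dy = ybar_m - sy / i  # ybar_m - ybar_i
        d = 1.0 - dx * dx * m * i / (q_xx * (m - i))  # D_m(i/m, i/m)
        stats.append((dy - dx * beta_hat) / math.sqrt(d * (m - i) / (m * i)))
    return stats


def mc_percentile(m, m0, m1, level, reps, rng):
    """Empirical percentile of M_1 = max |U_m(i/m)| under H_0 with random x."""
    maxima = []
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(m)]
        y = [rng.gauss(0.0, 1.0) for _ in range(m)]  # alpha = beta = 0, sigma = 1
        maxima.append(max(abs(u) for u in u_stats(x, y, m0, m1)))
    maxima.sort()
    return maxima[min(reps - 1, int(level * reps))]


print(mc_percentile(20, 2, 18, 0.95, 1000, random.Random(0)))
```

The statistic is invariant to adding a straight line to the $y_j$'s, so simulating with zero intercept and slope loses no generality under $H_0$.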

Table 2 concerns the MLRS with unknown variance, $M_2$. Using (21) in James, James, and Siegmund (1987), a similar kind of result is obtained. The numbers in parentheses are the approximations to the significance levels of $M_2$ using the approximation derived for the known variance case. This gives some insight about whether the approximation derived for the known variance case can be applied to the unknown variance case. Since in the next section we will derive an approximation to the significance level of $M_1$ when the independent variable is fixed and see how that works for the unknown variance case, we will discuss this more later.

2.3.2.  When  the  independent  variable  is  fixed 

As we can see in Theorem 2.2.6 and Theorem 2.2.8, the limiting distribution is not a function of a Brownian motion but involves a different Gaussian process when the independent variable is fixed. In this section, in order to get an approximation to the significance level of $M_1$, we begin with the case where $x_j = j/m$ and later consider more general configurations of the independent variable. In principle, Durbin (1985) gave an approximation formula for the probability of boundary crossing by a continuous Gaussian process satisfying some conditions. However, as before, such approximations are not accurate since they do not take discreteness into consideration.

The main result of this section is a new approximation taking discreteness into account. Assuming the normality of the error variable, we can consider the significance level of $M_1$ as a boundary crossing probability by the Gaussian process, $U_m$, defined on $\{i : m_0 \le i \le m_1\}$. As discussed before, our Gaussian process is nonstationary and nondifferentiable. To approximate the boundary crossing probability by the discrete stochastic process whose limiting process has a non-differentiable sample path, the argument in Leadbetter, Lindgren and Rootzen (1983, Chapter 12), as modified for discrete time by Hogan and Siegmund (1986), will be used. We start with the given discrete time Gaussian process and derive an approximation to the boundary crossing probability by this discrete process as the sample size gets large. In Leadbetter, Lindgren and Rootzen (1983, Chapter 12), the goal is to approximate the boundary crossing probability by a non-differentiable continuous Gaussian process. They first considered the probability of crossing the boundary by the given process at discrete instants of time and then let the spacing between the time points get smaller and smaller. Actually we get the same result if we consider the continuous limiting process and find an asymptotic expression for the boundary crossing probability by this limiting process observed only at the discrete instants of time. From Lemma 2.3.1 through Theorem 2.3.5, it is assumed that $x_j = j/m$ for $j = 1, \dots, m$, and to obtain nontrivial limits as $b \to \infty$, we use the normalized process, $U_{t,m}^b(i) = b(U_m(t + i/m) - b)$, where $b^2/m \to a$. In order to state approximations to the significance levels in Theorems 2.3.5 and 2.3.7, it is helpful to introduce the function

$$\nu(x) = 2x^{-2} \exp\Big\{ -2 \sum_{n=1}^{\infty} n^{-1}\, \Phi\big(-\tfrac{1}{2} x n^{1/2}\big) \Big\}, \quad (x > 0) \tag{2.10}$$

where $\Phi$ denotes the standard normal distribution function. The function $\nu$ was used by Siegmund (1985) and is easily evaluated numerically by (2.10) or approximately as suggested in Siegmund (1985, Ch X).
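A direct numerical evaluation of (2.10) can be sketched as below. The truncation rule (stopping once the argument of $\Phi$ falls below $-10$) and the function names are assumptions made for this illustration; for very small $x$ the series needs many terms and this simple loop becomes slow.

```python
import math


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def nu(x):
    """Evaluate nu(x) of (2.10) by truncating the series in n."""
    if x <= 0.0:
        raise ValueError("nu(x) is defined for x > 0")
    # Terms with x * sqrt(n) / 2 > 10 are below Phi(-10) / n and are dropped.
    n_max = int((20.0 / x) ** 2) + 1
    total = 0.0
    for n in range(1, n_max + 1):
        total += std_normal_cdf(-0.5 * x * math.sqrt(n)) / n
    return 2.0 / (x * x) * math.exp(-2.0 * total)


for x in (0.5, 1.0, 2.0, 4.0):
    print(x, nu(x))
```

The function decreases from values near 1 for small $x$ toward $2/x^2$ for large $x$, which is why the discrete-time correction matters most when $b^2/m$ is not small.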

Lemma 2.3.1.

Suppose that $x_j = j/m$ for $j = 1, \dots, m$.

Let $U_{t,m}^b(i) = b(U_m(t + i/m) - b)$, and suppose $m \to \infty$, $b \to \infty$ so that $b^2/m \to a$. Then, the conditional distributions of $U_{t,m}^b(i)$ given that $U_{t,m}^b(0) = x$ are normal with

$$E[U_{t,m}^b(i) - x \mid U_{t,m}^b(0) = x] = -\mu_a(t)\, i + o(1), \tag{2.11}$$

$$\mathrm{Cov}[U_{t,m}^b(i) - x,\ U_{t,m}^b(j) - x \mid U_{t,m}^b(0) = x] = 2\mu_a(t) \min(i,j) + o(1), \tag{2.12}$$

where $\mu_a(t) = a/[2t(1-t)\{1 - 3t(1-t)\}]$.

Proof : Using Taylor series expansion of covariance functions and doing a tedious calculation, (2.11) and (2.12) are obtained. |
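The leading terms in (2.11) and (2.12) can be checked numerically. The sketch below is an illustration under a simplifying assumption: it works with the continuous limiting process $U$ of Theorem 2.2.6 observed at the points $t + i/m$, rather than with the finite-sample process, and uses the usual conditional mean and variance of a bivariate normal pair. The function names are hypothetical.

```python
import math


def sigma(t, s):
    """Limiting covariance of Theorem 2.2.6 with D(t, s) = 1 - 3 s (1 - t), t <= s."""
    if t > s:
        t, s = s, t

    def d(u, v):
        return 1.0 - 3.0 * v * (1.0 - u)

    return math.sqrt(t * (1 - s) / (s * (1 - t))) * d(t, s) / math.sqrt(d(t, t) * d(s, s))


def mu_a(t, a):
    """Drift parameter of Lemma 2.3.1."""
    return a / (2.0 * t * (1.0 - t) * (1.0 - 3.0 * t * (1.0 - t)))


def conditional_moments(t, i, m, b, x):
    """Mean and variance of b(U(t + i/m) - b) - x given b(U(t) - b) = x."""
    r = sigma(t, t + i / m)
    u0 = b + x / b                  # value of U(t) implied by the conditioning
    mean = b * (r * u0 - b) - x     # E[U(t + i/m) | U(t)] = r * U(t)
    var = b * b * (1.0 - r * r)     # Var[U(t + i/m) | U(t)] = 1 - r^2
    return mean, var


a, m, t, i = 0.2, 10 ** 6, 0.3, 3
b = math.sqrt(a * m)
mean, var = conditional_moments(t, i, m, b, x=-0.5)
print(mean, -mu_a(t, a) * i)
print(var, 2.0 * mu_a(t, a) * i)
```

For large $m$ the two printed pairs should nearly agree, reflecting the drift $-\mu_a(t)$ and variance $2\mu_a(t)$ per step of the local random walk.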


The first step in our derivation of the distribution of $M_1$ is to consider the maximum taken over a fixed number of points, $t, t + 1/m, \dots, t + n/m$.

Lemma 2.3.2.

For fixed $n$ and $a$, as $b \to \infty$ and $m \to \infty$,

$$\Pr\Big\{ \max_{0 \le i \le n} U_m(t + i/m) > b \Big\} \Big/ [\phi(b)/b] \to 1 + H_a(t,n), \tag{2.13}$$

where

$$H_a(t,n) = \int_{-\infty}^{0} \exp(-x) \Pr\Big\{ \max_{0 \le i \le n} Y_a^t(i) > -x \Big\}\, dx,$$

and $Y_a^t(i)$ is a partial sum of i.i.d. random variables with $Y_a^t(1) \sim N(-\mu_a(t), 2\mu_a(t))$.

Proof : Since the conditional distribution is normal, it is determined by its mean and covariance. Then the previous lemma implies the limiting process can be represented by

$$Y_a^t(i) = \sigma_a(t) W(i) - \mu_a(t)\, i,$$

where $W$ is a standard Brownian motion and $\sigma_a^2(t) = 2\mu_a(t)$.

Then, following the same argument as in Lemma 12.2.3 of Leadbetter, Lindgren, and Rootzen (1983), (2.13) holds. |


Lemma 2.3.3.

There exists a function $H^*(t)$ such that

$$\lim_{n \to \infty} H_a(t,n)/n = H^*(t) \quad \text{uniformly in } t.$$

As $b \to \infty$ and $m \to \infty$,

$$\Pr\Big\{ \max_{t_0 \le i/m \le t_1} U_m(i/m) > b \Big\} \Big/ [b\phi(b)] \to \int_{t_0}^{t_1} H^*(t)\, dt / a. \tag{2.14}$$

Proof : Let

$$B_k = \Big\{ \max_{kn \le i < (k+1)n} U_m(i/m) > b \Big\} = \Big\{ \max_{0 \le i \le n-1} U_m(kn/m + i/m) > b \Big\}.$$

Then it can be shown that

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} U_m(i/m) > b \Big\} \sim \sum_{k=K_0}^{K_1} P\{B_k\},$$

where $\lfloor K_0 n \rfloor = m_0$, $\lfloor K_1 n \rfloor = m_1$, and $\lfloor x \rfloor$ denotes the greatest integer which is less than or equal to $x$.

By Lemma 2.3.2,

$$\sum_{k=K_0}^{K_1} P\{B_k\} \sim [\phi(b)/b] \sum_{k=K_0}^{K_1} [1 + H_a(kn/m, n)] \sim b\phi(b) \Big[ 1/(na) + \sum_{k=K_0}^{K_1} H_a(kn/m, n)/b^2 \Big].$$

And thus

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} U_m(i/m) > b \Big\} \Big/ [b\phi(b)] \sim 1/(na) + \sum_{k=K_0}^{K_1} H_a(kn/m, n)/b^2 \sim 1/(na) + \int_{t_0}^{t_1} H_a(t,n)\, dt / (na).$$

The proof is completed by letting $n \to \infty$ and proceeding as in Lemma 12.2.4 of Leadbetter, Lindgren, and Rootzen (1983). |

The last step is to evaluate $H^*$ in (2.14). In evaluating $H^*$, we use the argument in Siegmund (1985, Ch VIII), which leads to the derivation of the boundary crossing probability by a random walk with unit variance.
Lemma 2.3.4.

$$\int_{t_0}^{t_1} H^*(t)\, dt / a = \int_{t_0}^{t_1} \mu_a(t)\, \nu[2\mu_a^*(t)]\, dt / a, \tag{2.15}$$

where

$$\mu_a(t) = a/[2t(1-t)\{1 - 3t(1-t)\}],$$

$$\mu_a^*(t) = \{\mu_a(t)/2\}^{1/2}.$$

Proof : Note that

$$H_a(t,n) = \int_{-\infty}^{0} \exp(-x) \Pr\Big\{ \max_{0 \le i \le n} Y_a^t(i) > -x \Big\}\, dx,$$

where $Y_a^t(i)$ is a partial sum of the i.i.d. random variables defined in Lemma 2.3.2. Let $Y_a^{t,*}(1) = Y_a^t(1)/\sigma_a(t)$ to make the variance equal to 1.

Then Wald's likelihood ratio identity implies that

$$H_a(t,n) = \sigma_a(t) \int_0^{\infty} \exp[y\sigma_a(t)] \Pr\Big\{ \max_{0 \le i \le n} Y_a^{t,*}(i) > y \Big\}\, dy = \sigma_a(t) \int_0^{\infty} E_{\mu^*}[\exp\{-2\mu_a^*(t) R_y\} : T_y \le n]\, dy,$$

where

$$T_y = \inf\{n \ge 1 : Y_a^{t,*}(n) > y\},$$

$$R_y = Y_a^{t,*}(T_y) - y.$$

Hence it suffices to evaluate the limit as $n \to \infty$ of

$$n^{-1} \int_0^{\infty} E_{\mu^*}[\exp\{-2\mu_a^*(t) R_y\} : T_y \le n]\, dy.$$

By the same argument as in Lemma 3.4 of Hogan and Siegmund (1986), this is approximated by $\mu_a^*(t)\, \nu[2\mu_a^*(t)]$, as $n \to \infty$. Therefore (2.15) holds. |


By combining Lemmas 2.3.1, 2.3.2, 2.3.3, and 2.3.4, we obtain the following approximation to the tail of the distribution of the maximum, $\max_{m_0 \le i \le m_1} U_m(i/m)$, over an interval $(m_0, m_1]$.

Theorem 2.3.5.

Assume that $b \to \infty$, $m_0 \to \infty$, $m_1 \to \infty$, and $m \to \infty$ in such a way that for some $0 < t_0 < t_1 < 1$ and $a > 0$

$$m_i/m \to t_i, \ i = 0, 1 \quad \text{and} \quad b^2/m \to a.$$

Then as $m \to \infty$,

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} U_m(i/m) > b \Big\} \sim b\phi(b) \int_{t_0}^{t_1} \nu[2\mu_a^*(t)]\, \mu_a(t)\, dt / a,$$

where

$$\mu_a(t) = a/[2t(1-t)\{1 - 3t(1-t)\}],$$

$$\mu_a^*(t) = \{\mu_a(t)/2\}^{1/2}.$$

Remark 2.3. When $x_j = j/m$, $j = 1, \dots, m$, the significance level of the test can be approximated by

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} |U_m(i/m)| > b \Big\} \sim 2 \Pr\Big\{ \max_{m_0 \le i \le m_1} U_m(i/m) > b \Big\} \sim b\phi(b) \int_{t_0}^{t_1} 2\mu_a(t)\, \nu[2\mu_a^*(t)]\, dt / a. \tag{2.16}$$

Table 3 gives an indication of the accuracy of (2.16). As before, percentiles of $M_1$, $b_1$, were obtained by the same kind of Monte Carlo experiment. Table 3 also indicates that the approximation (2.16) can be applied to the unknown variance case. In Table 3, $b_2$ are the percentiles of $M_2$ for various sample sizes and it can be said that the approximations are reasonably accurate if sample sizes are big enough and $a < 0.1$. Since the case of $x_j = j/m$ can be applied to the regression model in which the $x_j$'s are equally spaced time points, which arises often in statistical analysis, we provide in Table 4 the tail probabilities of $M_1$ when $x_j = j/m$ under $H_0$.
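One way the approximation (2.16) might be used in practice is to solve for the threshold $b$ that gives a prescribed significance level. The sketch below does this by bisection. It is an illustration only: the midpoint rule, the bracketing interval $[2, 6]$, the series truncation inside `nu`, and all function names are assumptions of this example, with $a$ replaced by $b^2/m$ and $t_i$ by $m_i/m$.

```python
import math


def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def nu(x):
    """Series (2.10), truncated once Phi(-x * sqrt(n) / 2) is negligible."""
    n_max = int((20.0 / x) ** 2) + 1
    s = sum(std_normal_cdf(-0.5 * x * math.sqrt(n)) / n for n in range(1, n_max + 1))
    return 2.0 / (x * x) * math.exp(-2.0 * s)


def tail_prob_uniform(b, m, m0, m1, steps=200):
    """Right-hand side of (2.16) for x_j = j/m, using the midpoint rule in t."""
    a = b * b / m
    t0, t1 = m0 / m, m1 / m
    h = (t1 - t0) / steps
    integral = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * h
        mu = a / (2.0 * t * (1.0 - t) * (1.0 - 3.0 * t * (1.0 - t)))
        integral += 2.0 * mu * nu(2.0 * math.sqrt(mu / 2.0)) * h
    phi_b = math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)
    return b * phi_b * integral / a


def critical_value(alpha, m, m0, m1, lo=2.0, hi=6.0, iters=30):
    """Bisection for b with tail_prob_uniform(b) = alpha; assumes a root in [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if tail_prob_uniform(mid, m, m0, m1) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


b_crit = critical_value(0.05, 40, 4, 36)
print(round(b_crit, 3))
```

Since (2.16) is an asymptotic tail approximation, thresholds obtained this way are meaningful only for small significance levels, where the approximated probability is well below one.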

In the remaining part of this section, an approximation to the significance level for a general configuration of the values of the $x_j$'s will be derived and numerical results will be presented. Proofs will be omitted since they follow closely those of the previous theorem.

Lemma 2.3.6.

Suppose that $x_j = f(j/m)$, $j = 1, \dots, m$, for some integrable function $f$ such that $f(0) = 0$ and $f(1) = 1$.

Then as $m \to \infty$ and $b \to \infty$ in such a way that $b^2/m \to a$,

$$E[U_{t,m}^b(i) - x \mid U_{t,m}^b(0) = x] \to -\mu_a(t)\, i, \tag{2.17}$$

$$\mathrm{Cov}[U_{t,m}^b(i) - x,\ U_{t,m}^b(j) - x \mid U_{t,m}^b(0) = x] \to 2\mu_a(t) \min(i,j), \tag{2.18}$$

where $\mu_a(t) = a/[2t(1-t)\{1 - g^2(t)\, t(1-t)\}]$, and $g$ was defined in Lemma 2.2.7.

Proof : (2.17) and (2.18) directly follow from a long calculation. |

Theorem 2.3.7.

Suppose that $x_j = f(j/m)$, $j = 1, \dots, m$, for some integrable function $f$ such that $f(0) = 0$ and $f(1) = 1$.

Assume that $b \to \infty$, $m_0 \to \infty$, $m_1 \to \infty$, and $m \to \infty$ in such a way that for some $0 < t_0 < t_1 < 1$ and $a > 0$

$$m_i/m \to t_i, \ i = 0, 1 \quad \text{and} \quad b^2/m \to a.$$

Then as $m \to \infty$,

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} |U_m(i/m)| > b \Big\} \sim b\phi(b) \int_{t_0}^{t_1} \frac{\nu[2\mu_a^*(t)]}{t(1-t)\{1 - g^2(t)\, t(1-t)\}}\, dt, \tag{2.19}$$

where $\mu_a^*(t) = \{a/[t(1-t)\{1 - g^2(t)\, t(1-t)\}]\}^{1/2}/2$.

Table 5 indicates that the theoretical approximation (2.19) is quite accurate when $x_j = (j/m)^2$. In Table 5, $p_1$ is obtained for $f(x) = x^2$, and $p_2$ is the approximation using the linear interpolation of the $x_j$'s as $f$. We get the percentiles for the unknown variance case by the Monte Carlo method and approximate the significance levels using the approximation formula derived for the known variance case. Even though they are not perfect, a rough idea about the tail probability can be obtained from them.
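The general approximation (2.19) can be evaluated in the same way once $g$ is available. The following sketch is an illustration under stated assumptions: $g$ is passed in as a Python function, the closed form $g(t) = (\sqrt{5}/2)(1 + t)$ used for $f(x) = x^2$ is worked out here from the definition in Lemma 2.2.7, $a$ is replaced by $b^2/m$, and the function names are hypothetical.

```python
import math


def std_normal_cdf(z):
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def nu(x):
    """Series (2.10), truncated once Phi(-x * sqrt(n) / 2) is negligible."""
    n_max = int((20.0 / x) ** 2) + 1
    s = sum(std_normal_cdf(-0.5 * x * math.sqrt(n)) / n for n in range(1, n_max + 1))
    return 2.0 / (x * x) * math.exp(-2.0 * s)


def tail_prob_general(b, m, m0, m1, g, steps=200):
    """Right-hand side of (2.19) for a design with limiting function g."""
    a = b * b / m
    t0, t1 = m0 / m, m1 / m
    h = (t1 - t0) / steps
    integral = 0.0
    for k in range(steps):
        t = t0 + (k + 0.5) * h
        weight = t * (1.0 - t) * (1.0 - g(t) ** 2 * t * (1.0 - t))
        mu_star = 0.5 * math.sqrt(a / weight)
        integral += nu(2.0 * mu_star) / weight * h
    phi_b = math.exp(-0.5 * b * b) / math.sqrt(2.0 * math.pi)
    return b * phi_b * integral


def g_uniform(t):
    return math.sqrt(3.0)  # f(x) = x


def g_quadratic(t):
    return math.sqrt(5.0) / 2.0 * (1.0 + t)  # f(x) = x^2 (derived for this sketch)


p_lin = tail_prob_general(3.0, 40, 4, 36, g_uniform)
p_sq = tail_prob_general(3.0, 40, 4, 36, g_quadratic)
print(p_lin, p_sq)
```

With $g \equiv \sqrt{3}$ the formula reduces to (2.16), so the first value serves as a consistency check between the uniform and the general case.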


2.4. Powers and Confidence Regions

In this section we follow the arguments of James, James, and Siegmund (1987) to obtain an approximation to the power of (2.1). We derive an approximation of the power of (2.1) for the fixed $x_j$'s, starting from the uniformly spaced $x_j$'s. Suppose that we observe $(x_1, y_1), \dots, (x_m, y_m)$ as in section 2.2 and $x_j = j/m$ for $j = 1, \dots, m$, that there is exactly one change point, $\rho$, only in the intercept term of the regression line and that $\alpha_1$, $\alpha_2$, and $\beta$ are unknown parameters. In order to get an intuitive idea of the boundary crossing by the given stochastic process, we consider a modified stochastic process and a curved boundary as follows. Let $U_m^*(i) = U_m(i/m)\{i(1 - i/m)\}^{1/2}$. Then from (2.2) and (2.3) it can be easily seen that the process $U_m^*(i)$ $(i = m_0, \dots, m_1)$ has the mean value,

$$E[U_m^*(i)] = \begin{cases} i(1 - \rho/m)\, \dfrac{D_m(i/m, \rho/m)}{\{D_m(i/m, i/m)\}^{1/2}}\, (\alpha_2 - \alpha_1), & i \le \rho, \\[2ex] \rho(1 - i/m)\, \dfrac{D_m(i/m, \rho/m)}{\{D_m(i/m, i/m)\}^{1/2}}\, (\alpha_2 - \alpha_1), & i > \rho, \end{cases} \tag{2.20}$$

and the covariance function for $i \le j$,

$$\mathrm{Cov}[U_m^*(i), U_m^*(j)] = i(1 - j/m)\, r_m(i,j), \tag{2.21}$$

where

$$D_m(i/m, j/m) = 1 - (\bar{x}_m - \bar{x}_i)(\bar{x}_m - \bar{x}_j)\, mj / \{Q_{xx,m}(m - j)\} \quad \text{for } i \le j,$$

$$r_m(i,j) = D_m(i/m, j/m) / \{D_m(i/m, i/m)\, D_m(j/m, j/m)\}^{1/2}.$$

For $1 < m_0 < m_1 < m$, let

$$T_0 = \inf\{i : i \ge m_0,\ |U_m^*(i)| > b\{i(1 - i/m)\}^{1/2}\},$$

$$T_1 = \sup\{i : i \le m_1,\ |U_m^*(i)| > b\{i(1 - i/m)\}^{1/2}\}, \tag{2.22}$$

and let $\Pr_\xi^{(\rho)}\{T_0 \le m_1\} = \Pr\{T_0 \le m_1 \mid U_m^*(\rho) = \xi\}$. The power of the test defined by (2.1) is of the form $\Pr_\rho\{M_1 > b\} = \Pr_\rho\{T_0 \le m_1\}$, where $m_0 < \rho < m_1$. It is obvious that

$$\Pr_\rho\{T_0 \le m_1\} = \Pr\{|U_m^*(\rho)| > b\{\rho(1 - \rho/m)\}^{1/2}\} + \Pr\{|U_m^*(\rho)| < b\{\rho(1 - \rho/m)\}^{1/2},\ T_0 \le m_1\}$$

$$= \Pr\{|U_m^*(\rho)| > b\{\rho(1 - \rho/m)\}^{1/2}\} + \int_{|\xi| < b\{\rho(1 - \rho/m)\}^{1/2}} \Pr_\xi^{(\rho)}\{T_0 \le m_1\}\, \Pr\{U_m^*(\rho) \in d\xi\}. \tag{2.23}$$

Since the marginal distribution of $U_m^*(\rho)$ is known, to approximate (2.23) it suffices to approximate the conditional probability in (2.23). To approximate $\Pr_\xi^{(\rho)}\{T_0 \le m_1\}$ we may assume that

$$|\xi| = b\{\rho(1 - \rho/m)\}^{1/2} - x \tag{2.24}$$

with $x = O(1)$ as $m \to \infty$, since the principal contribution to the integral on the right-hand side of (2.23) comes from values of $\xi$ close to the boundary value. Given $U_m^*(\rho) = \xi$ of the form (2.24), if $|U_m^*(i)| > b\{i(1 - i/m)\}^{1/2}$ for some $m_0 \le i < \rho$ and $|U_m^*(j)| > b\{j(1 - j/m)\}^{1/2}$ for some $\rho < j \le m_1$, this event with overwhelming probability occurs for some $i$ and $j$ which are close to $\rho$. Moreover, given $U_m^*(\rho) = \xi$, asymptotically as $m \to \infty$ the processes $U_m^*(i)$ $(i = m_0, \dots, \rho)$ and $U_m^*(j)$ $(j = \rho + 1, \dots, m_1)$ are conditionally independent for $i$ and $j$ close to $\rho$. Thus we can write

$$\Pr_\xi^{(\rho)}\{T_0 \le m_1\} \cong \Pr_\xi^{(\rho)}\{T_0 < \rho\} + \Pr_\xi^{(\rho)}\{T_1 > \rho\} - \Pr_\xi^{(\rho)}\{T_0 < \rho\}\, \Pr_\xi^{(\rho)}\{T_1 > \rho\}. \tag{2.25}$$

Since both probabilities on the right-hand side of (2.25) are of the same form, it is enough to consider the first one. To approximate the first probability, we assume that $m$ is large and that $\rho$ and $\rho - m_0$ are proportional to $m$.

Lemma 2.4.1.

Let $\xi = b\{\rho(1 - \rho/m)\}^{1/2} - x = m\xi_0$. For $i < \rho$, given $U_m^*(\rho) = \xi$, as $m \to \infty$, $\rho/m \to t^*$, for each fixed $i$,

$$[U_m^*(\rho - i) - \mu(\rho - i \mid \rho)]\, \{D(t^*, t^*)\}^{1/2}$$

is distributed approximately as $S_i = \sum_{k=1}^{i} z_k$, where the $z_k$'s are i.i.d. standard normal random variables and $\mu(\rho - i \mid \rho) = E[U_m^*(\rho - i) \mid U_m^*(\rho) = \xi]$.

Proof : Since $U_m^*(i)$ is distributed as $N(\mu(i \mid \rho), \sigma^2(i \mid \rho))$ given $U_m^*(\rho) = \xi$, where

$$\mu(i \mid \rho) = \xi\, r_m(i, \rho)\, i/\rho,$$

$$\sigma^2(i \mid \rho) = i(1 - i/m) - [r_m(i, \rho)]^2\, i^2 (1 - \rho/m)/\rho,$$

it can be easily obtained that as $m \to \infty$,

$$\mathrm{Cov}[U_m^*(\rho - i),\ U_m^*(\rho - j) \mid U_m^*(\rho) = \xi] \to \min(i,j)/D(t^*, t^*).$$

Thus $[U_m^*(\rho - i) - \mu(\rho - i \mid \rho)]\{D(t^*, t^*)\}^{1/2}$ behaves like a sum of independent normally distributed random variables, each having mean 0 and variance 1. |

Now we define stopping times $\tau_0^+$ and $\tau_0^-$ as follows :

$$\tau_0^+ = \inf\{i : i \ge m_0,\ U_m^*(i) > b\{i(1 - i/m)\}^{1/2}\},$$

$$\tau_0^- = \inf\{i : i \ge m_0,\ U_m^*(i) < -b\{i(1 - i/m)\}^{1/2}\}.$$

Lemma 2.4.2.

Suppose that $b \to \infty$, $\rho \to \infty$, $m \to \infty$ in such a way that $b/\sqrt{m} \to b_0$, and $\rho/m \to t^*$. Let $x = b\{\rho(1 - \rho/m)\}^{1/2} - \xi = O(1)$.

Then as $m \to \infty$,

$$\Pr_\xi^{(\rho)}\{\tau_0^+ < \rho\} \to \nu[2\eta] \exp[-2\eta x \{D(t^*, t^*)\}^{1/2}],$$

and similarly

$$\Pr_\xi^{(\rho)}\{T_0 < \rho\} \to \nu[2\eta] \exp[-2\eta x \{D(t^*, t^*)\}^{1/2}], \tag{2.26}$$

where $\eta = b_0 / [2\{D(t^*, t^*)\, t^*(1 - t^*)\}^{1/2}]$.

Proof : By (2.20) and elementary calculus, it can be seen that for fixed $x$, $i$,

$$b\{(\rho - i)(1 - (\rho - i)/m)\}^{1/2} - \mu(\rho - i \mid \rho) \to x + i\, b_0 / [2 D(t^*, t^*) \{t^*(1 - t^*)\}^{1/2}].$$

From Lemma 2.4.1,

$$\Pr_\xi^{(\rho)}\{\tau_0^+ < \rho\} = \Pr\{U_m^*(\rho - i) - \mu(\rho - i \mid \rho) > b\{(\rho - i)(1 - (\rho - i)/m)\}^{1/2} - \mu(\rho - i \mid \rho) \ \text{for some } 1 \le i \le \rho - m_0\}$$

$$\to \Pr_0\{S_i > i\eta + x\{D(t^*, t^*)\}^{1/2} \ \text{for some } i \ge 1\},$$

where $\eta = b_0 / [2\{D(t^*, t^*)\, t^*(1 - t^*)\}^{1/2}]$ and $S_i$ was given in Lemma 2.4.1. Therefore this conditional probability is approximately the same as

$$\Pr_{-\eta}\{S_i^* > y \ \text{for some } i \ge 1\},$$

where $S_i^*$ is a partial sum of i.i.d. random variables, each having mean $-\eta$ and variance 1, and $y = x\{D(t^*, t^*)\}^{1/2}$. Following the argument in Siegmund (1985, Ch VIII), this probability can be approximated by $\nu(2\eta) \exp[-2\eta x \{D(t^*, t^*)\}^{1/2}]$, which can be used as an approximation to the conditional probability in (2.23). |


Theorem 2.4.3.

Suppose that $b \to \infty$, $\rho \to \infty$, $m \to \infty$ in such a way that $b/\sqrt{m} \to b_0$, and $\rho/m \to t^*$. Then as $m \to \infty$,


Pr„{Afi  >  b)  ~  [1  -  *(7)] 


+  m_3<p(  7) 


2u(2  r,) 


^{<*(1  -  t*)D(t*,t*)}5 


_ ^(2r?) _ 1 

mi(60  +  £{r(l  -  t*)Z)(r,f)}s  )J  ’ 

(2.27) 


where 

7  = 


and $D$ and $\eta$ are given in Lemma 2.4.2.

Proof : Note that

$$\Pr_\xi^{(\rho)}\{T_0 < \rho\} \cong \Pr_\xi^{(\rho)}\{\tau_0^+ < \rho\} + \Pr_\xi^{(\rho)}\{\tau_0^- < \rho\} - \Pr_\xi^{(\rho)}\{\tau_0^+ < \rho\}\, \Pr_\xi^{(\rho)}\{\tau_0^- < \rho\}.$$

Using Lemma 2.4.2 and the fact that the major contribution to $\int_{|\xi| < b\{\rho(1 - \rho/m)\}^{1/2}} \Pr_\xi^{(\rho)}\{T_0 < \rho\}\, \Pr\{U_m^*(\rho) \in d\xi\}$ comes from the probability of crossing the upper boundary by the process $U_m^*(i)$ conditioned on $U_m^*(\rho) = \xi$ which is close to the upper boundary, we get

$$\int_{-b\{\rho(1 - \rho/m)\}^{1/2}}^{b\{\rho(1 - \rho/m)\}^{1/2}} \Pr_\xi^{(\rho)}\{T_0 < \rho\}\, \Pr\{U_m^*(\rho) \in d\xi\}$$

$$\sim \int_{-b\{\rho(1 - \rho/m)\}^{1/2}}^{b\{\rho(1 - \rho/m)\}^{1/2}} \nu(2\eta) \exp[-2\eta\{b\{\rho(1 - \rho/m)\}^{1/2} - \xi\}\{D(t^*, t^*)\}^{1/2}]\, \Pr\{U_m^*(\rho) \in d\xi\}.$$

Using a similar approximation for the other conditional probability on the right-hand side of (2.25), and evaluating the integral in (2.23) asymptotically as $m \to \infty$, we obtain (2.27). |


Remark  2.4.  When  x3  —  f(j/m)  for  some  integrable  function  /  such  that  /( 0)  =  0 
and  /( 1)  =  1,  we  get  the  same  result  but  with  D(t',t’)  defined  in  Lemma  2.2.6. 

Table 6 shows the approximated powers of the statistic (2.1) when the $x_j$'s are uniformly
spaced. For each of the sample sizes m=20 and m=40, one sided significance
levels 0.025 and 0.5 are considered. A Monte Carlo experiment was performed and shows
that the approximation given in Theorem 2.4.3 is accurate enough. The fourth column of
Table  6  involves  the  LRS  with  different  choices  of  6  and  p.  And  the  sixth  column  involves 
the MLRS. Roughly speaking, the unmodified LRS and the modified LRS perform about
the same, but it can be seen that the modified LRS with $m_0 > 1$ and $m_1 < m - 1$ has
power which improves over the unmodified LRS at points except those close to 0 or $m$.

To find a confidence region for $p$, the method of Cox and Spjøtvoll (1982), discussed
in Worsley (1986), can be used. Let the confidence region $D_\alpha$ contain all change-points $p$
that partition the sequence into two subsequences in which we accept the hypothesis of
no further change-points at level $\alpha$. Consider the tests for a further change-point in the two
subsequences:

$$H_{0,p}^- : \alpha^{(1)} = \cdots = \alpha^{(p)} \quad \text{against}$$

$$H_{1,p}^- : \exists\, 1 \le k < p \text{ such that } \alpha^{(1)} = \cdots = \alpha^{(k)} \ne \alpha^{(k+1)} = \cdots = \alpha^{(p)},$$

and

$$H_{0,p}^+ : \alpha^{(p+1)} = \cdots = \alpha^{(m)} \quad \text{against}$$

$$H_{1,p}^+ : \exists\, p+1 \le k < m \text{ such that } \alpha^{(p+1)} = \cdots = \alpha^{(k)} \ne \alpha^{(k+1)} = \cdots = \alpha^{(m)}.$$

If both $H_{0,p}^-$ and $H_{0,p}^+$ are accepted at the combined level $\alpha$, then we put $p$ in $D_\alpha$.
Let $M_{1,p}^-$ be the equivalent of the test statistic $M_1$ evaluated only for the subsequence
of observations $(x_1, y_1), \ldots, (x_p, y_p)$ and let $M_{1,p}^+$ be the equivalent of the test statistic
$M_1$ evaluated only for the subsequence of observations $(x_{p+1}, y_{p+1}), \ldots, (x_m, y_m)$. Define
$\Pr\{M_{1,p}^- < b\} = G_p^-(b)$ and $\Pr\{M_{1,p}^+ < b\} = G_p^+(b)$. Since the two subsequences are
independent,

$$\Pr\{M_{1,p}^- < b_1 \text{ and } M_{1,p}^+ < b_2\} = \Pr\{M_{1,p}^- < b_1\}\,\Pr\{M_{1,p}^+ < b_2\} = G_p^-(b_1)\,G_p^+(b_2),$$

and so an exact $(1 - \alpha)$ confidence region for $p$ is

$$D_\alpha = \{p : G_p^-(M_{1,p}^-)\,G_p^+(M_{1,p}^+) < 1 - \alpha\}.$$

Asymptotically as $m \to \infty$ and $p \to \infty$, $G_p^-(\cdot)$ and $G_p^+(\cdot)$ have the same formula and can
be obtained from (2.16).
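
The construction of $D_\alpha$ can be sketched in a few lines. The sketch below assumes that the two subsequence statistics have already been computed for each candidate $p$ and that the distribution functions $G_p^-$ and $G_p^+$ are supplied as callables (for example, from an approximation such as (2.16)); the function name `confidence_region` and its arguments are hypothetical.

```python
def confidence_region(m_minus, m_plus, g_minus, g_plus, alpha):
    """Collect every candidate change point p whose two subsequence tests
    are jointly accepted, i.e. G^-(M^-) * G^+(M^+) < 1 - alpha.

    m_minus, m_plus : dict mapping p to the statistic on each subsequence
    g_minus, g_plus : callables (p, b) -> Pr{statistic < b}
    """
    region = []
    for p in sorted(m_minus):
        joint = g_minus(p, m_minus[p]) * g_plus(p, m_plus[p])
        if joint < 1.0 - alpha:
            region.append(p)
    return region
```

A candidate $p$ is dropped only when at least one of the two subsequences shows strong evidence of a further change.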

In the rest of the section, mathematical results about the confidence set of the change
point are stated and the related problems are discussed. Suppose that we observe
$y_1, \ldots, y_m$ and $x_j = j/m$ for $j = 1, \ldots, m$, that the hypothesis of exactly one change only
in the intercept of the regression line is true, and that $\alpha_1$, $\alpha_2$, and $\beta$ are unknown nuisance
parameters. Then the likelihood-based confidence set for a change point can be defined
as follows.

For $1 \le m_0 < m_1 < m$ and $c > 0$, define

$$A(p, c) = \Big\{ \max_{m_0 \le i \le m_1} [U_m(i/m)]^2 - [U_m(p/m)]^2 \le c \Big\},$$

where $U_m$ is the process defined in Section 2.1.

Although the unconditional probability of $A(p,c)$ depends on both $p$ and $(\alpha_2 - \alpha_1)$,
inference can be made free of $(\alpha_2 - \alpha_1)$ if we condition on the sufficient statistic
$U_m^*(p) = U_m(p/m)\{p(1 - p/m)\}^{1/2} = \xi$. Thus in principle $c = c(\alpha,p,\xi)$ can be determined by

$$\Pr\{A(p,c) \mid U_m^*(p) = \xi\} = 1 - \alpha, \qquad (2.28)$$

where $\alpha$ is a significance level of the test. Then the set of all $p$ such that the sample path
$\{U_m(j/m), j = m_0, \ldots, m_1\}$ belongs to $A[p, c(\alpha,p,U_m^*(p))]$ is a $(1 - \alpha)100\%$ confidence set
for $p$. Then


$$\alpha = \Pr\Big\{ \max_{m_0 \le i \le m_1} |U_m(i/m)| > b \;\Big|\; U_m^*(p) = \xi \Big\},$$
where  6  =  {[c(a,p,£)]2  + 

$$= \Pr_\xi^{(p)}\{T_0 \le m_1\}$$

$$= \Pr_\xi^{(p)}\{T_0 \le p\} + \Pr_\xi^{(p)}\{T_1 > p\} - \Pr_\xi^{(p)}\{T_0 \le p \text{ and } T_1 > p\}, \qquad (2.29)$$


where $T_0$ and $T_1$ are defined in (2.22). Since the third conditional probability in (2.29)
is negligible compared to the first and the second probabilities, which are usually small,
in order to get a confidence set it suffices to find an approximation to $\Pr_\xi^{(p)}\{T_0 \le p\}$.
This conditional probability depends on how big the difference is between the conditioned
value, $\xi$, and the boundary value at the change point, $\pm b\{p(1 - p/m)\}^{1/2}$. In this section,
we consider the confidence set when

$$\Delta = b\{p(1 - p/m)\}^{1/2} - \xi = O(m).$$


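$$P = \Pr_\xi^{(p)}\{T_0 \le p\} = \sum_{n=m_0}^{p} \Pr_\xi^{(p)}\{T_0 = n\}$$

$$= \sum_{n=m_0}^{p} \int \Pr\{|U_{m,\xi}^{(p)}(k)| < b\{k(1 - k/m)\}^{1/2} \text{ for all } m_0 \le k < n \mid |U_{m,\xi}^{(p)}(n)| = y\}$$

$$\times \Pr_\xi^{(p)}\{|U_m^*(n)| \in b\{n(1 - n/m)\}^{1/2} + dx\},$$

where $U_{m,\xi}^{(p)}(k)$ is the process $U_m^*(k)$ conditioned on $U_m^*(p) = \xi$, and $y = b\{n(1 - n/m)\}^{1/2} + x$.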


Lemma  2.4.4. 

Given that $U_{m,\xi}^{(p)}(n) = y$, as $m \to \infty$, $n/m \to s$, $p/m \to t$, $\xi = m\xi_0$, and for a fixed
$k$, $[U_{m,\xi}^{(p)}(n - k) - \mu^{(p)}(n - k \mid n)]\{D(s,s)\}^{1/2}$ is distributed approximately as a sum of
i.i.d. random variables each of which has mean 0 and variance 1, where

$$\mu^{(p)}(k \mid n) = k[C_1(k,n,p)\xi_0 + C_2(k,n,p)y/n],$$

$$C_1(k,n,p) = \frac{(1 - n/m)[r(k,p) - r(k,n)r(n,p)]}{[1 - (n/m) - r^2(n,p)(1 - p/m)n/p]\,p/m},$$

$$C_2(k,n,p) = \frac{(1 - n/m)r(k,n) - r(k,p)r(n,p)(1 - p/m)n/p}{1 - (n/m) - r^2(n,p)(1 - p/m)n/p},$$

$$r(n,p) = D(n/m, p/m)/\{D(n/m, n/m)D(p/m, p/m)\}^{1/2},$$

$$D(n/m, p/m) = 1 - 3(1 - n/m)(p/m) \quad \text{for } n < p.$$

Proof. Since $U_{m,\xi}^{(p)}(n)$ is distributed as $N(\mu(n|p), \sigma^2(n|p))$, where

$$\mu(n|p) = \xi r_m(n,p)n/p,$$

$$\sigma^2(n|p) = n(1 - n/m) - \{r_m(n,p)\}^2(1 - p/m)n^2/p,$$

it can be shown that

$$\mu^{(p)}(k|n) = E[U_{m,\xi}^{(p)}(k) \mid U_{m,\xi}^{(p)}(n) = y] = k[C_1(k,n,p)\xi_0 + C_2(k,n,p)y/n],$$

$$[\sigma^{(p)}(k|n)]^2 = \mathrm{Var}[U_{m,\xi}^{(p)}(k) \mid U_{m,\xi}^{(p)}(n) = y] = k[\zeta(k,p) - \{C_2(k,n,p)\}^2\zeta(n,p)k/n],$$

where

$$\zeta(k,p) = 1 - (k/m) - r_m^2(k,p)(1 - p/m)k/p.$$

Then direct calculation implies that as $m \to \infty$,

$$[\sigma^{(p)}(n - k \mid n)]^2 \to k/D(s,s),$$

and

$$\mathrm{Cov}(U_{m,\xi}^{(p)}(n - k_1), U_{m,\xi}^{(p)}(n - k_2) \mid U_{m,\xi}^{(p)}(n) = y) \to \min(k_1,k_2)/D(s,s).$$

Hence the proof is completed. ∎


Lemma 2.4.5.


Suppose that $b \to \infty$, $m \to \infty$, in such a way that $n/m \to s$, $p/m \to t$, and $b/\sqrt{m} \to b_0$.
Then

$$P_b(n,x) = \Pr\{U_{m,\xi}^{(p)}(k) < b\{k(1 - k/m)\}^{1/2} \text{ for all } m_0 \le k < n \mid U_{m,\xi}^{(p)}(n) = y\}$$

$$\sim \Pr_\mu\{S_k > x\{D(s,s)\}^{1/2} \text{ for all } k \ge 1\},$$

where $S_k$ is the sum of $k$ i.i.d. random variables with mean $\mu$ and variance 1, and

$$\mu = B_1 - B_2\xi_0,$$

$$B_1 = b_0\{D(s,s)\}^{1/2}/[2\{s(1 - s)\}^{1/2}],$$

Bi  =  3(1  -  s)(s/t  -  l)n/{man,p)D(s,3){D(t,t)}}], 


$$D(t_1,t_2) = 1 - 3t_2(1 - t_1) \quad \text{for } t_1 < t_2.$$


Proof. Note that

$$P_b(n,x) = \Pr\{U_{m,\xi}^{(p)}(k) - \mu^{(p)}(k|n) < B_b(n,k) \text{ for all } m_0 \le k < n \mid U_{m,\xi}^{(p)}(n) = y\},$$

where

$$B_b(n,k) = b\{k(1 - k/m)\}^{1/2} - \mu^{(p)}(k|n).$$

Since the joint distribution of $\{[U_{m,\xi}^{(p)}(n - k) - \mu^{(p)}(n - k|n)]\{D(n/m,n/m)\}^{1/2}, k = 1, \ldots, n - m_0\}$
given that $U_{m,\xi}^{(p)}(n) = y$ converges to the unconditional joint distribution
of $\{S_k, k = 1, \ldots, n - m_0\}$ and $B_b(n,n-k)\{D(n/m,n/m)\}^{1/2} \sim k[B_1 - B_2\xi_0] - x\{D(s,s)\}^{1/2}$,

$$P_b(n,x) \sim \Pr_0\{S_k < k[B_1 - B_2\xi_0] - x\{D(s,s)\}^{1/2} \text{ for all } k \ge 1\}$$

$$= \Pr_{-\mu}\{S_k < -x\{D(s,s)\}^{1/2} \text{ for all } k \ge 1\}$$

$$= \Pr_\mu\{S_k > x\{D(s,s)\}^{1/2} \text{ for all } k \ge 1\}$$

$$= \{E_\mu[S_{\tau_+}]\}^{-1}\Pr_\mu\{S_{\tau_+} > x\{D(s,s)\}^{1/2}\},$$

where $\tau_+ = \inf\{k : k \ge 1, S_k > 0\}$ and the last equality holds by the argument in
Siegmund (1986, Chapter XIII). ∎




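Theorem 2.4.6.

Assume that $b\{p(1 - p/m)\}^{1/2} - \xi = O(m)$ and $b \to \infty$, $n \to \infty$, $p \to \infty$, and $m \to \infty$,
in such a way that $b/\sqrt{m} \to b_0$, $n/m \to s$, and $p/m \to t$. Then

$$\Pr_\xi^{(p)}\{T_0 \le p\} \sim \sum_{n=m_0}^{p} \int_0^\infty \exp[-d(n,p)x/\sigma(n|p)]\,\Pr_\mu\{S_{\tau_+} > x\{D(s,s)\}^{1/2}\}\,dx\,R(n,p),$$

where

$$R(n,p) = \phi(d(n,p))\{\sigma(n|p)E_\mu[S_{\tau_+}]\}^{-1},$$

$$d(n,p) = [b\{n(1 - n/m)\}^{1/2} - \mu(n|p)]/\sigma(n|p).$$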

Proof. Note that

$$\Pr_\xi^{(p)}\{T_0 \le p\} = \Pr_\xi^{(p)}\{\tau_0^+ \le p\} + \Pr_\xi^{(p)}\{\tau_0^- \le p\} - \Pr_\xi^{(p)}\{\tau_0^+ \le p \text{ and } \tau_0^- \le p\}.$$

Since for $\xi > 0$,

$$\Pr_\xi^{(p)}\{T_0 \le p\} \cong \Pr_\xi^{(p)}\{\tau_0^+ \le p\} = \sum_{n=m_0}^{p} \Pr_\xi^{(p)}\{\tau_0^+ = n\}$$

$$= \sum_{n=m_0}^{p} \int_0^\infty P_b(n,x)\,\Pr_\xi^{(p)}\{U_m^*(n) \in b\{n(1 - n/m)\}^{1/2} + dx\},$$

and similarly

$$\Pr_\xi^{(p)}\{T_0 \le p\} \cong \Pr_\xi^{(p)}\{\tau_0^- \le p\},$$

by Lemmas 2.4.4 and 2.4.5 and letting $m \to \infty$,

$$\Pr_\xi^{(p)}\{T_0 \le p\} \sim \sum_{n=m_0}^{p} \int_0^\infty \{\Pr_\mu\{S_{\tau_+} > x\{D(s,s)\}^{1/2}\}/E_\mu[S_{\tau_+}]\}\,d_x\{\Phi(d(n,p) + x/\sigma(n|p))\}$$

$$\sim \sum_{n=m_0}^{p} \int_0^\infty \Pr_\mu\{S_{\tau_+} > x\{D(s,s)\}^{1/2}\}\exp[-d(n,p)x/\sigma(n|p)]\,dx\,R(n,p),$$

where $d(n,p)$ and $R(n,p)$ are defined above. ∎

Remark 2.5. If $d(n^*,p)/[\sigma(n^*|p)\{D(n^*/m,n^*/m)\}^{1/2}] = 2\mu$ for some $n^*$ at which the
integration has the biggest contribution to $\Pr_\xi^{(p)}\{T_0 \le p\}$, then $\Pr_\xi^{(p)}\{T_0 \le p\}$ can be
reduced to $\sum \Pr_\xi^{(p)}\{T_0 = n\}$, where the summation is over $n$ which are close to $n^*$, and to
a further simpler form by the argument in Siegmund (1986, Chapter IX).




Chapter  3 


Change  in  Both  Intercept  and  Slope 


3.1.  Models  and  Test  Statistics 

In Chapter 3, the two-phase linear regression model in which both intercept and
slope terms change will be considered. Quandt (1958) introduced this model. He proposed
the LRT to test for this type of two-phase regression model as opposed to the null
hypothesis of the simple linear regression and observed that the LRS doesn't follow the
standard maximum likelihood asymptotic theory. This type of two-phase regression model
has many applications in econometrics, biology, quality control, and so on. Brown, Durbin
and Evans (1975) give three examples involving growth in the number of local telephone
calls, the demand for money, and staff requirements of an organization. They use recursive
residuals to study the stability over time of regression relationships and discuss Quandt's
likelihood method. Hinkley (1971) studies a small set of data obtained from replicated
experimental determination of the relationship between blood factor VII production and
warfarin concentration. He applies a broken line regression model with a continuity
constraint to this set of data. The same kind of example appears in Haddad, Jeng, and Lai
(1987) who use a two-phase regression model to summarize the time course and change in
heart rate during respiratory pauses in puppies and young adult dogs.

We consider the problem of testing the null hypothesis that the data follow one simple
linear regression:

$$H_0 : y_j = \alpha + \beta x_j + \varepsilon_j, \quad j = 1, \ldots, m,$$

against the alternative hypothesis that there is a change both in the intercept and slope:

$$H_1 : \exists\, 1 \le p < m \text{ such that}$$

$$y_j = \alpha_1 + \beta_1 x_j + \varepsilon_j, \quad j = 1, \ldots, p,$$

$$y_j = \alpha_2 + \beta_2 x_j + \varepsilon_j, \quad j = p + 1, \ldots, m,$$

where $\alpha_1 \ne \alpha_2$ and $\beta_1 \ne \beta_2$.

Unlike Hinkley's model, we do not assume mathematical continuity of the two-phase
regression line and we suppose that a change happens at the $p$th data point as in Chapter 2.
In this section, we assume that the $\varepsilon_j$'s are independently, identically normally distributed
with mean 0 and variance $\sigma^2$ and we derive the LRS for cases of known and unknown $\sigma^2$
and study the null and alternative distributions of the LRS. When $\sigma^2$ is known, $\sigma^2$ can be
assumed to be equal to 1 without loss of generality. Then the $-2\log$(likelihood ratio) statistic
for a fixed change point $i$ is proportional to

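$$[\bar y_m - \bar y_i]^2\,mi/(m - i) + [Q_{xy,i}^2/Q_{xx,i} + Q_{xy,i}^{*2}/Q_{xx,i}^*] - [Q_{xy,m}^2/Q_{xx,m}], \qquad (3.1)$$

where

$$\bar x_i = \Big(\sum_{j=1}^{i} x_j\Big)/i, \qquad \bar x_i^* = \Big(\sum_{j=i+1}^{m} x_j\Big)/(m - i),$$

$$\bar y_i = \Big(\sum_{j=1}^{i} y_j\Big)/i, \qquad \bar y_i^* = \Big(\sum_{j=i+1}^{m} y_j\Big)/(m - i),$$

$$Q_{xx,i} = \sum_{j=1}^{i} (x_j - \bar x_i)^2, \qquad Q_{xx,i}^* = \sum_{j=i+1}^{m} (x_j - \bar x_i^*)^2,$$

$$Q_{xy,i} = \sum_{j=1}^{i} (x_j - \bar x_i)(y_j - \bar y_i), \qquad Q_{xy,i}^* = \sum_{j=i+1}^{m} (x_j - \bar x_i^*)(y_j - \bar y_i^*).$$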
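
With $\sigma^2 = 1$, expression (3.1) is the drop in residual sum of squares when one fitted line is replaced by two lines fitted separately before and after the $i$th point. The sketch below evaluates (3.1) directly and compares it with that drop; the helper names (`cross_sum`, `lr_statistic`, `rss_drop`) and the sample data are illustrative assumptions, not part of the original text.

```python
def mean(values):
    return sum(values) / len(values)


def cross_sum(u, v):
    """Centered cross-product sum, e.g. Q_xy = sum (x_j - xbar)(y_j - ybar)."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


def lr_statistic(x, y, i):
    """Evaluate (3.1) for a candidate change point i (split after point i).

    Requires at least two distinct x-values on each side of the split.
    """
    m = len(x)
    x1, y1, x2, y2 = x[:i], y[:i], x[i:], y[i:]
    jump = (mean(y) - mean(y1)) ** 2 * m * i / (m - i)
    segments = (cross_sum(x1, y1) ** 2 / cross_sum(x1, x1)
                + cross_sum(x2, y2) ** 2 / cross_sum(x2, x2))
    pooled = cross_sum(x, y) ** 2 / cross_sum(x, x)
    return jump + segments - pooled


def rss(x, y):
    """Residual sum of squares of a simple linear regression of y on x."""
    return cross_sum(y, y) - cross_sum(x, y) ** 2 / cross_sum(x, x)


def rss_drop(x, y, i):
    """RSS of one line minus the total RSS of two separately fitted lines."""
    return rss(x, y) - rss(x[:i], y[:i]) - rss(x[i:], y[i:])


x = [j / 10 for j in range(1, 11)]
y = [0.3, 1.1, 0.2, 0.9, 1.4, 2.8, 2.1, 3.5, 2.9, 3.8]
for i in range(2, 9):
    print(i, round(lr_statistic(x, y, i), 6), round(rss_drop(x, y, i), 6))
```

The two printed columns agree for every admissible split, which is a convenient check on an implementation of (3.1).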


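To get some insight about the distribution of (3.1), we can rewrite (3.1) as

$$\|V_m(i/m)\|^2 = \delta_i'\Sigma_i^{-1}\delta_i,$$

where

$$\delta_i = (\hat\alpha_i - \hat\alpha_i^*,\ \hat\beta_i - \hat\beta_i^*)',$$

$$\hat\alpha_i = \bar y_i - \hat\beta_i\bar x_i, \qquad \hat\alpha_i^* = \bar y_i^* - \hat\beta_i^*\bar x_i^*,$$

$$\hat\beta_i = Q_{xy,i}/Q_{xx,i}, \qquad \hat\beta_i^* = Q_{xy,i}^*/Q_{xx,i}^*,$$

$$\Sigma_i = \begin{pmatrix} m/[i(m - i)] + \bar x_i^2/Q_{xx,i} + \bar x_i^{*2}/Q_{xx,i}^* & -(\bar x_i/Q_{xx,i} + \bar x_i^*/Q_{xx,i}^*) \\ -(\bar x_i/Q_{xx,i} + \bar x_i^*/Q_{xx,i}^*) & 1/Q_{xx,i} + 1/Q_{xx,i}^* \end{pmatrix}.$$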
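
The quadratic form $\delta_i'\Sigma_i^{-1}\delta_i$ can be checked numerically against the drop in residual sum of squares between the one-line and two-line fits. The sketch below assumes $\sigma^2 = 1$ and takes the off-diagonal entries of $\Sigma_i$ to be negative, as implied by the usual covariance between a least squares intercept and slope; all function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def cross_sum(u, v):
    """Centered cross-product sum of two equally long sequences."""
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


def quadratic_form(x, y, i):
    """delta_i' Sigma_i^{-1} delta_i for a split after the i-th point."""
    m = len(x)
    x1, y1, x2, y2 = x[:i], y[:i], x[i:], y[i:]
    qx1, qx2 = cross_sum(x1, x1), cross_sum(x2, x2)
    slope1, slope2 = cross_sum(x1, y1) / qx1, cross_sum(x2, y2) / qx2
    xb1, xb2 = mean(x1), mean(x2)
    icpt1, icpt2 = mean(y1) - slope1 * xb1, mean(y2) - slope2 * xb2
    d0, d1 = icpt1 - icpt2, slope1 - slope2
    # Entries of the 2x2 covariance matrix Sigma_i (sigma^2 = 1 assumed).
    s00 = m / (i * (m - i)) + xb1 ** 2 / qx1 + xb2 ** 2 / qx2
    s01 = -(xb1 / qx1 + xb2 / qx2)
    s11 = 1.0 / qx1 + 1.0 / qx2
    det = s00 * s11 - s01 ** 2
    # Closed-form inverse of a symmetric 2x2 matrix applied to (d0, d1).
    return (s11 * d0 * d0 - 2.0 * s01 * d0 * d1 + s00 * d1 * d1) / det


def rss(x, y):
    return cross_sum(y, y) - cross_sum(x, y) ** 2 / cross_sum(x, x)


def rss_drop(x, y, i):
    return rss(x, y) - rss(x[:i], y[:i]) - rss(x[i:], y[i:])
```

For example, with `x = [j / 10 for j in range(1, 11)]` and any response vector of length 10, `quadratic_form(x, y, i)` and `rss_drop(x, y, i)` agree up to rounding for `i = 2, ..., 8`.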

Hence the likelihood ratio test (LRT) of $H_0$ against $H_1$ can be based on

$$\max_{1 < i < m} \|V_m(i/m)\|.$$

As in Chapter 2, we shall consider the modified LRS

$$M_3 = \max_{m_0 \le i \le m_1} \|V_m(i/m)\|,$$

where $1 < m_0 < m_1 < m$. Based on the MLRS, $H_0$ is rejected for a large value of $M_3$ and
the value of $i$ which maximizes $\|V_m(i/m)\|$ is the maximum likelihood estimate of the
true change point. Under $H_0$, $\delta_i$ has a bivariate normal distribution with mean $(0,0)$ and
covariance matrix $\Sigma_i$ for each $i$, and so the null distribution of $M_3^2$ is that of the maximum of a
sequence of correlated chi-square random variables.

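Here is another expression of the LRS which will be used in Sections 3.2 and 3.3:

$$\|V_m(i/m)\|^2 = [V_{1,m}(i/m)]^2 + [V_{2,m}(i/m)]^2, \qquad (3.3)$$

where

$$V_{1,m}(i/m) = [A_1'(X_1'X_1)^{-1}X_1'Y]/[A_1'(X_1'X_1)^{-1}A_1]^{1/2},$$

$$V_{2,m}(i/m) = [A_2'(X_2'X_2)^{-1}X_2'Y]/[A_2'(X_2'X_2)^{-1}A_2]^{1/2},$$

$$A_1' = (1, -1, 0), \qquad A_2' = (0, 1, 0, -1),$$

$$X_1 = \begin{pmatrix} 1 & 0 & x_1 \\ \vdots & \vdots & \vdots \\ 1 & 0 & x_i \\ 0 & 1 & x_{i+1} \\ \vdots & \vdots & \vdots \\ 0 & 1 & x_m \end{pmatrix}, \qquad X_2 = \begin{pmatrix} 1 & x_1 & 0 & 0 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_i & 0 & 0 \\ 0 & 0 & 1 & x_{i+1} \\ \vdots & \vdots & \vdots & \vdots \\ 0 & 0 & 1 & x_m \end{pmatrix},$$

$$Y' = (y_1, \ldots, y_m).$$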

From this representation, one sees that $V_{1,m}(i/m)$ is the same statistic as $U_m(i/m)$
defined in Section 2.1. That is, $V_{1,m}(i/m)$ is the LRS to test that only the intercept changes
as opposed to the null hypothesis of no change. On the other hand $V_{2,m}(i/m)$ is the LRS
to test the null hypothesis that only the intercept term in the regression line changes at
the point $i$ against the alternative hypothesis that both intercept and slope change after
the same point $i$. It is easy to show that each of $[V_{1,m}(i/m)]^2$ and $[V_{2,m}(i/m)]^2$ has a
chi-square distribution with 1 degree of freedom and the covariance function of the process
$\{V_{1,m}(i/m), i = 1, \ldots, m\}$ was given in Section 2.1. For the process $V_{2,m}$ the covariance
between $V_{2,m}(i/m)$ and $V_{2,m}(j/m)$ for $i < j$ is given by


Cov  [  V2,m(i/m),  V2,m(j/m)  ] 


_  f  Qxx.>  Q  zx,:  1 

l  Q'xx,xQzzo  J 


Dm(i/mJ/m) 


{D  i/m)Dm(j /m,j / m)}2 


r  ,  (3.4) 


where 


•  Dm(i/m,j /m)  =  1  -  (xm  -  x,)(xn  -  x,)mjl[{m  -  j)Qzx]  for  i  <  j. 

It is convenient to introduce the notations

$$A_{11} = \mathrm{Cov}[V_{1,m}(i/m), V_{1,m}(j/m)], \qquad A_{12} = \mathrm{Cov}[V_{1,m}(i/m), V_{2,m}(j/m)],$$

$$A_{21} = \mathrm{Cov}[V_{1,m}(j/m), V_{2,m}(i/m)], \qquad A_{22} = \mathrm{Cov}[V_{2,m}(i/m), V_{2,m}(j/m)].$$

One delicate matter is the cross covariance between the two processes $V_{1,m}$ and $V_{2,m}$. It can
be easily checked that $V_{1,m}(i/m)$ and $V_{2,m}(i/m)$ are independent at each point $i$. However,
for different points $i$ and $j$ such that $i < j$, the covariance function is as follows:
miQ'x  ,.j


(*.  "  *>) 


(m  t)Qxi,mQrr,j  J  {Dm(i/m,  i/ m)Dm(j / m,  j / m)}> 

x  _  f  m(m  -  j)Qxx,t\  * _ (£2  ~  £T) _ 

\  jQxx,mQrx,t  J  {Dm(i/m,i/m)Dm(j/m,j/m)}^ 


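In summary,

$$\mathrm{Cov}[V_m(i/m), V_m(j/m)] = \begin{pmatrix} I_2 & A \\ A' & I_2 \end{pmatrix}, \quad \text{where} \quad A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}.$$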

Thus $\{V_m(i/m), i = 1, \ldots, m\}$ is a two-dimensional stochastic process with zero
drift and the covariance function given above. Again the null distribution of $M_3$ depends
on the $x_j$'s only through this covariance structure of $\{V_m(i/m)\}$, not on $\alpha$, $\beta$. Under the
alternative, $V_m(i/m)$ has a bivariate normal distribution and the covariance structure
remains the same as under the null hypothesis. So the only difference of the LRS under
$H_1$ is the non-zero drift of $\{V_m(i/m)\}$. For convenience we use the notation

$$\Delta_\alpha = \alpha_2 - \alpha_1, \qquad \Delta_\beta = \beta_2 - \beta_1.$$

Then under $H_1$, $V_{1,m}(i/m)$ has non-zero mean for all $i$, which is
$E[V_{1,m}(i/m)]$

.  [(1  -  p/m)A,,p{  Aa  +  Agxm)  -  Ag(x,  -  x)Q'IX  J  . 

=  j  - - - - j - ,  t  <  p 

{i(l  -  i/m)Dm{i/rn,i/m)Qxx,my 
Up/m)A,,p{ Aa  +  Agxp)  -  Ag(x*  -  i)<?ix,P] 


/m)Dm(i/m,  i/m)Q  xr.m} 


’Qxx.m  Dm(i/m,p/m).
And for $V_{2,m}(i/m)$,


$E[V_{2,m}(i/m)]$

_  f  Qxz,,  1  -  ?)(*.*  -  ^){A°  +  A^;}  -  rj 

1  {Dm(t/m,t/m)}» 

—  /  ^xz,i  1  J  [p(^«  ~  *p){^a  ~b  ~  &qQx i,p] 

{Z)m(t7m,t/m)}> 


t  >  p. 


So the alternative distribution of $M_3$ depends on the unknown parameters $\alpha_2 - \alpha_1$, $\beta_2 - \beta_1$
and the unknown change point $p$.

If $\sigma^2$ is unknown, the LRS is proportional to

$$\max_{1 < i < m} \|V_m(i/m)\|/\hat\sigma,$$

where $\hat\sigma^2 = (Q_{yy,m} - Q_{xy,m}^2/Q_{xx,m})/m$. Thus the modified LRS is

$$M_4 = \max_{m_0 \le i \le m_1} \|V_m(i/m)\|/\hat\sigma.$$

In the following sections, results similar to those of Chapter 2 will be discussed.
We study the asymptotic behavior of the MLRS under $H_0$ for the cases of known and
unknown variance. In Section 3.3, we derive an approximation to the significance level
of $M_3$ and present simulation results which support the analytical approximation derived
for the known variance case, and show that this approximation can be applied to the unknown
variance case.
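
As a small illustration of the two modified statistics, the sketch below computes $M_3$, $M_4$ and the maximizing index from data. It assumes $\sigma^2 = 1$ for $M_3$ and uses the fact that $\|V_m(i/m)\|^2$ equals the drop in residual sum of squares between the one-line and two-line fits; the function name `modified_lrs` and the noise-free example data are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def cross_sum(u, v):
    u_bar, v_bar = mean(u), mean(v)
    return sum((a - u_bar) * (b - v_bar) for a, b in zip(u, v))


def rss(x, y):
    """Residual sum of squares of a simple linear regression of y on x."""
    return cross_sum(y, y) - cross_sum(x, y) ** 2 / cross_sum(x, x)


def modified_lrs(x, y, m0, m1):
    """Return (M3, M4, p_hat) with the maximum taken over m0 <= i <= m1."""
    m = len(x)
    total = rss(x, y)
    # ||V_m(i/m)||^2 as the RSS drop; clipped at 0 to guard against rounding.
    squared = {i: max(total - rss(x[:i], y[:i]) - rss(x[i:], y[i:]), 0.0)
               for i in range(m0, m1 + 1)}
    p_hat = max(squared, key=squared.get)
    m3 = math.sqrt(squared[p_hat])
    sigma_hat = math.sqrt(total / m)  # estimate of sigma under H0
    return m3, m3 / sigma_hat, p_hat


# Noise-free two-phase example: the line changes after the 5th point.
x = [j / 10 for j in range(1, 11)]
y = [1 + 2 * xj if j < 5 else 4 - xj for j, xj in enumerate(x)]
m3, m4, p_hat = modified_lrs(x, y, 2, 8)
print(m3, m4, p_hat)
```

In this noise-free example the two-line fit is exact at the true change point, so the maximizing index is 5 and $M_4$ equals $\sqrt{m}$.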

3.2.  Asymptotic  Behavior  of  Test  Statistics 

In Chapter 2, it was seen that the MLRS converges to the maximum absolute value
of functions of Brownian bridge processes or Gaussian processes according to the random
or fixed $x_j$'s, respectively. In the case where both intercept and slope change, we shall
obtain similar results which are extensions of those of Section 2.2. As we can guess from
the form of the MLRS, the limiting distribution is the maximum norm of random functions
involving two-dimensional Brownian bridges or two-dimensional Gaussian processes.
Section 3.2.1 concerns the case in which the independent variable is random. Also, the
asymptotic behavior of the MLRS is considered conditionally on the $x_j$'s. As in Chapter
2 we obtain the same limiting distributions whether we consider the null distribution of
the MLRS conditionally or unconditionally. We will deal with the case of the fixed values
of the independent variable in Section 3.2.2. The limiting behavior of the MLRS under
a mild assumption about the values of the independent variable will be studied, starting
from the case where the values of the independent variable are uniformly spaced. Although
the MLRS was derived assuming that the $\varepsilon_j$'s are identically and normally distributed, the
asymptotic results to be discussed in Sections 3.2.1 and 3.2.2 do not require this normality
assumption.

3.2.1.  When  the  independent  variable  is  random 


This section will show results similar to those of Section 2.2.1 using Donsker's theorem when
the independent variable is also random. As in Section 2.2.1, it can be easily checked
that the MLRS does not depend on the slope under the null hypothesis. This implies that
we can take $x$ as a random variable which is independent of $y$ when we study the null
distribution of the MLRS. The following theorem is on the convergence in distribution of
the MLRS when $\sigma_x^2$ and $\sigma_y^2$ are known. In this case we may assume $\sigma_x^2 = \sigma_y^2 = 1$ without
loss of generality.


Theorem  3.2.1. 

Let $(x_1, y_1), \ldots, (x_m, y_m)$ be i.i.d. random variables such that $E[x_1] = E[y_1] = 0$,
$E[x_1^2] = E[y_1^2] = 1$, and $E[x_1y_1] = 0$.

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$,

$$M_3 = \max_{m_0 \le i \le m_1} \frac{\|V_m^0(i/m)\|}{\{(i/m)(1 - i/m)\}^{1/2}} \to \max_{t_0 \le t \le t_1} \frac{\|W^0(t)\|}{\{t(1 - t)\}^{1/2}} \quad \text{in distribution},$$

where $V_m^0(i/m) = V_m(i/m)\{(i/m)(1 - i/m)\}^{1/2}$, $W^0(t)' = (W_1^0(t), W_2^0(t))$, and $W_1^0$ and $W_2^0$
are two independent Brownian bridge processes.


Proof: Note that

$$\|V_m^0(i/m)\|^2 = [V_{1,m}^0(i/m)]^2 + [V_{2,m}^0(i/m)]^2,$$

where

$$V_{1,m}^0(i/m) = [B_y(i/m) - B_x(i/m)(Q_{xy,m}/Q_{xx,m})]/\{D_m(i/m,i/m)\}^{1/2},$$

$$V_{2,m}^0(i/m) = \{(i/m)(1 - i/m)\}^{1/2}\{1/Q_{xx,i} + 1/Q_{xx,i}^*\}^{-1/2}\{Q_{xy,i}/Q_{xx,i} - Q_{xy,i}^*/Q_{xx,i}^*\},$$

$$D_m(i/m, i/m) = 1 - [B_x(i/m)]^2/[(i/m)(1 - i/m)Q_{xx,m}],$$

$$B_x(i/m) = (\bar x_i - \bar x_m)i/\sqrt{m}, \qquad B_y(i/m) = (\bar y_i - \bar y_m)i/\sqrt{m}.$$

(i) Let

$$Y_m(i/m) = B_y(i/m)/\{D_m(i/m,i/m)\}^{1/2},$$

$$X_m(i/m) = B_x(i/m)(Q_{xy,m}/Q_{xx,m})/\{D_m(i/m,i/m)\}^{1/2},$$

so that $V_{1,m}^0(i/m) = Y_m(i/m) - X_m(i/m)$. In Theorem 2.2.2, we have shown that
$Y_m \to W_1^0$ in distribution and $X_m \to 0$ in probability, which leads to

$$V_{1,m}^0 \to W_1^0 \quad \text{in distribution}.$$

(ii) Note that $V_{2,m}^0(i/m)$ can be rewritten as

$$\left\{\frac{(Q_{xx,i}/i)(Q_{xx,i}^*/(m - i))}{(Q_{xx,m}/m) - [B_x(i/m)]^2/\{i(1 - i/m)\}}\right\}^{1/2}\left\{(1 - i/m)\frac{Q_{xy,i}/\sqrt{m}}{Q_{xx,i}/i} - (i/m)\frac{Q_{xy,i}^*/\sqrt{m}}{Q_{xx,i}^*/(m - i)}\right\}.$$

Since $V_{2,m}^0(i/m)$ is a function of the partial sums of the $x_jy_j$, Donsker's Theorem can
be used to show that $V_{2,m}^0 \to W_2^0$ in distribution in the following way. Let $W_m(i/m) =
Q_{xy,i}/\sqrt{m}$. First we use the convergence of $W_m$ to the Brownian motion $W_2$ to
show that for any sets of $(r_1, \ldots, r_n)$ and $(t_1, \ldots, t_n)$ such that $(i_1/m, \ldots, i_n/m) \to (t_1, \ldots, t_n)$
as $m \to \infty$,

$$E\Big[\exp\Big\{\mathrm{i}\sum_{k=1}^{n} r_kV_{2,m}^0(i_k/m)\Big\}\Big] = E\Big[\exp\Big\{\mathrm{i}\sum_{k=1}^{n}(c_{1k}Q_{xy,i_k} + c_{2k}Q_{xy,i_k}^*)\Big\}\Big] \to E\Big[\exp\Big\{\mathrm{i}\sum_{k=1}^{n} r_kW_2^0(t_k)\Big\}\Big],$$

where $c_{1k}$ and $c_{2k}$ are appropriate coefficients. This implies that the finite dimensional
distributions of $V_{2,m}^0$ converge to those of $W_2^0$. Secondly, the tightness of the sequence
$\{V_{2,m}^0\}$ follows from the same sort of argument involved in the proof of Donsker's
theorem (Billingsley, 1968). Hence

$$V_{2,m}^0 \to W_2^0 \quad \text{in distribution}.$$

(iii) Since $V_{2,m}^0$ and $Y_m$ are independent and $X_m \to 0$ in probability, it is easy to show
that $(X_m, Y_m, V_{2,m}^0) \to (0, W_1^0, W_2^0)$ in distribution.

Then by the continuous mapping theorem the proof is completed. ∎

As  pointed  out  in  Section  2.2,  we  obtain  the  same  limiting  distribution  as  in  the 
preceding  theorem  when  the  variances  are  unknown. 

Corollary  3.2.2. 


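Under the same assumptions as in Theorem 3.2.1,

$$M_4 = \max_{m_0 \le i \le m_1} \frac{\|\hat V_m^0(i/m)\|}{\{(i/m)(1 - i/m)\}^{1/2}} \to \max_{t_0 \le t \le t_1} \frac{\|W^0(t)\|}{\{t(1 - t)\}^{1/2}} \quad \text{in distribution},$$

where $\hat V_m^0(i/m) = \{V_m(i/m)/\hat\sigma\}\{(i/m)(1 - i/m)\}^{1/2}$.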


In the following theorem, the asymptotic behavior of the MLRS will be considered
conditionally on the $x_j$'s when the $x_j$'s are a random sample from some distribution.

Theorem  3.2.3. 

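Let $z_j' = (x_j, y_j)$, $j = 1, 2, \ldots$, be a sequence of i.i.d. random vectors such that

$$E[z_j] = \mu \quad \text{and} \quad E[z_jz_j'] = \Sigma = \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{pmatrix}.$$

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$, conditionally given $x_1, x_2, \ldots$,

$$M_3 = \max_{m_0 \le i \le m_1} \frac{\|V_m^0(i/m)\|}{\{(i/m)(1 - i/m)\}^{1/2}} \to \max_{t_0 \le t \le t_1} \frac{\|W^0(t)\|}{\{t(1 - t)\}^{1/2}} \quad \text{in distribution},$$

for a.e. $x_1, x_2, \ldots$, where $W^0(t)' = (W_1^0(t), W_2^0(t))$ and $W_1^0$ and $W_2^0$ are two independent
Brownian bridges.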

Proof: To prove this, we follow the same argument as in the proof of Theorem 3.2.1. In
the proof of Theorem 2.2.4, we wrote $V_{1,m}^0 = Z_m - R_m$, where $Z_m$ and $R_m$ were defined in
Theorem 2.2.4, and showed that a.e. in $x$

(i) $Z_m \to W_1^0$ in distribution,

(ii) $\max R_m(i/m) \to 0$ in probability.

Then similar arguments show that, a.e. in $x$, as $m \to \infty$,

$$V_{2,m}^0 \to W_2^0 \quad \text{in distribution},$$

and hence

$$\Big(Z_m, \max_{m_0 \le i \le m_1} R_m(i/m), V_{2,m}^0\Big) \to (W_1^0, 0, W_2^0) \quad \text{in distribution}.$$

Therefore by the continuous mapping theorem, the proof is completed. The independence
between $W_1^0$ and $W_2^0$ can be proved by examining the limiting behavior of the covariance
functions given in (3.5) and (3.6). ∎


Corollary  3.2.4. 


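Under the same assumptions as in Theorem 3.2.3, conditionally given $x_1, x_2, \ldots$,

$$M_4 = \max_{m_0 \le i \le m_1} \frac{\|\hat V_m^0(i/m)\|}{\{(i/m)(1 - i/m)\}^{1/2}} \to \max_{t_0 \le t \le t_1} \frac{\|W^0(t)\|}{\{t(1 - t)\}^{1/2}} \quad \text{in distribution},$$

for a.e. $x_1, x_2, \ldots$, where $\hat V_m^0(i/m) = \{V_m(i/m)/\hat\sigma\}\{(i/m)(1 - i/m)\}^{1/2}$.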

In the above Theorem 3.2.3, $W_1^0$ and $W_2^0$ are independent since $\bar x_i - \bar x_j \to 0$ and
$\bar x_i^* - \bar x_j^* \to 0$ as $i, j \to \infty$. Thus the MLRS may not converge to the limiting distribution
given above if the empirical distribution of the $x_j$'s does not satisfy these conditions. We
will discuss this more carefully in the following section, describing how the covariance structure
depends on the spacings of the $x_j$'s.

3.2.2.  When  the  independent  variable  is  fixed

This section will show that the MLRS, $M_3$ and $M_4$, converge to the maximum norm
of two-dimensional Gaussian processes when the $x_j$'s are fixed. In the preceding section,
it was seen that $V_1$ and $V_2$ are asymptotically independent conditionally on the $x_j$'s when
the first and second sample moments of the $x_j$'s converge. The covariance function which
was given in (3.4)-(3.6) explains the effect of the spacing of the $x_j$'s on the distribution of
the MLRS. In the following theorem that gives the limiting distribution of the MLRS, we
use the representation of $(V_{1,m}^0(i/m), V_{2,m}^0(i/m))$ as

$$\Big(\sum_{k=1}^{m} a_{i,k}\varepsilon_k,\ \sum_{k=1}^{m} b_{i,k}\varepsilon_k\Big),$$

where $a_{i,k}$ was given in (2.6) and

$$b_{i,k} = \left\{\frac{(i/m)(1 - i/m)Q_{xx,i}Q_{xx,i}^*}{D_m(i/m,i/m)Q_{xx,m}}\right\}^{1/2}\frac{x_k - \bar x_i}{Q_{xx,i}}, \quad k \le i,$$

$$b_{i,k} = -\left\{\frac{(i/m)(1 - i/m)Q_{xx,i}Q_{xx,i}^*}{D_m(i/m,i/m)Q_{xx,m}}\right\}^{1/2}\frac{x_k - \bar x_i^*}{Q_{xx,i}^*}, \quad k > i.$$

Here we assume that $\sigma^2$ is known and hence without loss of generality equals one and
begin with the case in which the $x_j$'s are uniformly spaced.

Theorem  3.2.5. 

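Suppose that $x_j = j/m$ for $j = 1, \ldots, m$.

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$,

$$M_3 = \max_{m_0 \le i \le m_1} \|V_m(i/m)\| \to \max_{t_0 \le t \le t_1} \|V(t)\| \quad \text{in distribution}, \qquad (3.7)$$

where $V$ is a two-dimensional Gaussian process with mean 0 and covariance matrix

$$\mathrm{Cov}[V(t), V(s)] = \begin{pmatrix} I_2 & A_{t,s} \\ A_{t,s}' & I_2 \end{pmatrix},$$

$I_2$ is an identity matrix and

$$A_{t,s} = \begin{pmatrix} A_{11}(t,s) & A_{12}(t,s) \\ A_{21}(t,s) & A_{22}(t,s) \end{pmatrix}$$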

{Z?(M)£>(3,3)P


3(1"‘)  i  {D(t,t)D(s,s)}* 


{D(t,t)D(s,s)} 


>  {Z>(l,  *)£(«,  *)} 


$$D(s,t) = 1 - 3s(1 - t) \quad \text{for } t < s.$$


Proof: In proving this result, we have only to show that

$$V_m \to V \quad \text{in distribution}.$$

To prove that the finite-dimensional distributions of $V_m$ converge to those of $V$, it suffices
to show that for any sets of $(r_{1,1}, \ldots, r_{1,n})$, $(r_{2,1}, \ldots, r_{2,n})$ and $(t_1, \ldots, t_n)$ such that
$(i_1/m, \ldots, i_n/m) \to (t_1, \ldots, t_n)$ as $m \to \infty$,

$$E\Big[\exp\Big\{\mathrm{i}\sum_{k=1}^{n}(r_{1,k}V_{1,m}(i_k/m) + r_{2,k}V_{2,m}(i_k/m))\Big\}\Big] \to E\Big[\exp\Big\{\mathrm{i}\sum_{k=1}^{n}(r_{1,k}V_1(t_k) + r_{2,k}V_2(t_k))\Big\}\Big],$$

which follows from the same argument as in Theorem 2.2.6. In Theorem 2.2.6, we have shown
that the sequence $\{V_{1,m}\}$ is tight, and a similar argument shows that the sequence $\{V_{2,m}\}$
is also tight. Lemma 2.2.5 now implies that the sequence $\{V_m = (V_{1,m}, V_{2,m})\}$ is tight,
and hence $V_m \to V$ in distribution. Therefore (3.7) follows from the continuous mapping
theorem. ∎


The rest of this section is devoted to a generalization of Theorem 3.2.5 to the case
where $x_j = f(j/m)$ for some integrable function $f$. In fact, we need only to figure out
the limiting covariance function to find the limiting distribution of the MLRS, which is
described in the following theorem. The proof will be omitted since it is similar to that
of Theorem 3.2.5.

Theorem  3.2.6. 

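Suppose that $x_j = f(j/m)$, $j = 1, \ldots, m$, for some integrable function $f$ such that
$f(0) = 0$ and $f(1) = 1$.

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$,

$$M_3 = \max_{m_0 \le i \le m_1} \|V_m(i/m)\| \to \max_{t_0 \le t \le t_1} \|V(t)\| \quad \text{in distribution},$$

where $V$ is a two-dimensional Gaussian process with mean 0 and covariance matrix

$$\mathrm{Cov}[V(t), V(s)] = \begin{pmatrix} I_2 & A_{t,s} \\ A_{t,s}' & I_2 \end{pmatrix},$$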


{D(t,t)D(s,s)} 


h(s)[D(t,t)  -  h(t)}  ) 
t[D(s.  s)  -  h(s)}  \  *  (1  -  s)g(s)  -  (1  -  t)g(t) 


h(s)(l-t)  J  {D(t,t)D(s,s)y 
h(t)(l  -  s)  1>  sg(s)-tg(t) 


[D(t,t)  -  h{t)] }  {D(tj)D{s,s)}S 
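$$g(t) = \frac{\int_0^1 f(u)\,du - (\int_0^t f(u)\,du)/t}{(1 - t)[\int_0^1 f^2(u)\,du - (\int_0^1 f(u)\,du)^2]^{1/2}},$$

$$h(t) = \frac{\int_0^t f^2(u)\,du - (\int_0^t f(u)\,du)^2/t}{\int_0^1 f^2(u)\,du - (\int_0^1 f(u)\,du)^2},$$

$$D(s,t) = 1 - s(1 - t)g(s)g(t) \quad \text{for } t < s.$$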


Remark 3.1. In Chapter 2, it was discussed that the covariance function depends on the
configuration of the values of the independent variable only through the function $g$. In the
case where both the intercept and the slope change, one more function $h$ is involved to
explain such a dependence. Also $g(t) = \sqrt{3}$ and $h(t) = t^3$ when $f(u) = u$, which is the
case in which $x_j = j/m$.
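
The special case in Remark 3.1 can be checked numerically. The sketch below evaluates $g$, $h$ and $D$ by Simpson's rule for a user-supplied $f$, taking $g(t)$ as the difference between the overall mean of $f$ and its mean over $[0,t]$, divided by $(1-t)$ times the standard deviation of $f$, which is the normalization that gives $g(t) = \sqrt{3}$ for $f(u) = u$. The function names are hypothetical.

```python
import math


def simpson(func, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    step = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * step)
    return total * step / 3.0


def make_g_h_d(f):
    """Build g, h and D(s, t) (for t <= s) from the design function f."""
    first = simpson(f, 0.0, 1.0)                      # integral of f over [0, 1]
    second = simpson(lambda u: f(u) ** 2, 0.0, 1.0)   # integral of f^2 over [0, 1]
    variance = second - first ** 2

    def g(t):
        return (first - simpson(f, 0.0, t) / t) / ((1.0 - t) * math.sqrt(variance))

    def h(t):
        part = simpson(f, 0.0, t)
        part_sq = simpson(lambda u: f(u) ** 2, 0.0, t)
        return (part_sq - part ** 2 / t) / variance

    def d(s, t):
        return 1.0 - s * (1.0 - t) * g(s) * g(t)

    return g, h, d


g, h, d = make_g_h_d(lambda u: u)   # uniformly spaced design, x_j = j/m
print(g(0.3), h(0.3), d(0.7, 0.3))
```

For $f(u) = u$ the printed values agree with $\sqrt{3}$, $0.3^3$ and $1 - 3(0.7)(1 - 0.3)$, matching the uniform-spacing formulas of Theorem 3.2.5.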

When the variance is unknown, we obtain the same limiting distribution of $M_4$ as
that of $M_3$, which is stated in the following corollary.

Corollary 3.2.7.

Suppose that $x_j = f(j/m)$, $j = 1, \ldots, m$, for some integrable function $f$ such that
$f(0) = 0$ and $f(1) = 1$.

Under $H_0$, as $m \to \infty$ and $m_i/m \to t_i$ for $i = 0, 1$,

$$M_4 = \max_{m_0 \le i \le m_1} \|V_m(i/m)\|/\hat\sigma \to \max_{t_0 \le t \le t_1} \|V(t)\| \quad \text{in distribution},$$

where $V$ is a two-dimensional Gaussian process defined in Theorem 3.2.6.

3.3.  Approximations  to  Significance  Levels 

Now our concern is how to approximate significance levels of $M_3$ and $M_4$. We follow
basically the same arguments used in Section 2.3, extended to boundary crossing problems
for a discrete stochastic process which has a two-dimensional state space and a one-dimensional
time parameter. In Section 3.3.1, we give an asymptotic expression which can be used
to approximate $\Pr\{M_3 > b\}$ when the $x_j$'s are random, using the argument developed in
Siegmund (1986, Chapter 5). Then we derive an approximation to the right-hand tail of
the distribution under $H_0$ of $M_3$ when the $x_j$'s are fixed. Since these tail probabilities
are interpreted as significance levels, it is important that they be accurate when the true
probabilities are in the range .01 - .10. We perform Monte Carlo experiments and discuss
how accurately the asymptotic expressions approximate the actual distribution. It
will also be discussed how well significance levels of $M_4$, which is the MLRS when $\sigma^2$ is unknown,
can be approximated by the asymptotic result derived for the known variance case.

3.3.1.  When  the  independent  variable  is  random


In Section 3.2, we showed that

$$M_3 \to \max_{t_0 \le t \le t_1} \frac{\| W^0(t) \|}{\{t(1-t)\}^{1/2}} \quad \text{in distribution},$$

where $W^0$ is a two-dimensional Brownian bridge process on $[0,1]$ and $m_i/m \to t_i$ for $i = 0, 1$, as $m \to \infty$. In principle, we can approximate the significance level of the test, $\Pr\{M_3 > b\}$, by the tail probability of this limiting distribution. James, James, and Siegmund (1987) give an approximation to

$$\Pr\Big\{ \max_{t_0 \le t \le t_1} \| W^0(t) \| / \{t(1-t)\}^{1/2} > b \Big\},$$

where $W^0$ is a $d$-dimensional Brownian bridge process. As in Section 2.3.1, the approximations to tail probabilities of $M_3$ by those of this limiting distribution are too crude. Since the exact distribution of $M_3$ is too complicated, we shall now consider an analogous discrete time result as in Section 2.3.1. In the following proposition, we derive an approximation to the tail probability defined in terms of a Brownian bridge process observed at discrete instants of time, which is a generalization of (3.12) in Siegmund (1986).

Let $T = \inf\{n : n > m_0,\ \| S_n \| > b\{n(1 - n/m)\}^{1/2}\}$, where $S_n = z_1 + \cdots + z_n$ and the $z$'s are independent, normally distributed $d$-dimensional random variables with mean 0 and identity covariance matrix. Let $\Pr^{(m)}\{A\} = \Pr\{A \mid S_m = 0\}$.

Proposition  3.3.1. 

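Assume that $b \to \infty$, $m_0 \to \infty$, $m_1 \to \infty$, $m \to \infty$ in such a way that for some $0 < t_0 < t_1 < 1$ and $a > 0$,

$$m_i/m \to t_i \ \text{ for } i = 0, 1, \quad \text{and} \quad b^2/m \to a.$$

Then as $m \to \infty$,

$$\Pr^{(m)}\{T < m_1\} \sim b^d e^{-b^2/2}\, 2^{(1 - d/2)}\, [\Gamma(d/2)]^{-1} \int_{b(m_1^{-1} - m^{-1})^{1/2}}^{b(m_0^{-1} - m^{-1})^{1/2}} x^{-1} \nu(x + a/x)\, dx. \tag{3.8}$$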
Proof : (3.8) follows from an extension of the argument in the proof of Theorem 3.11 in Siegmund (1986). ∎

For $d = 2$ we obtain the desired result. Table 7 indicates the accuracy of this asymptotic expression as an approximation to significance levels of $M_3$ and $M_4$ when the independent variable is random. This asymptotic expression gives a rough idea of the significance levels and improves on the continuous approximation substantially.
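As a rough numerical illustration, the right-hand side of (3.8) can be evaluated as sketched below. This is only a sketch under stated assumptions: the function $\nu$ is defined in (2.10), which is not reproduced in this section, so the code assumes the standard form $\nu(x) = 2x^{-2}\exp\{-2\sum_{n \ge 1} n^{-1}\Phi(-x\sqrt{n}/2)\}$ with the small-$x$ approximation $\exp(-0.583x)$; the constant in front of the integral follows one reading of (3.8); and the names `nu`, `simpson`, and `approx_3_8` are hypothetical helpers, not part of the dissertation.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def nu(x):
    """Assumed form of nu: 2 x^-2 exp(-2 * sum_{n>=1} Phi(-x sqrt(n)/2) / n)."""
    if x < 0.5:                       # small-x approximation exp(-0.583 x)
        return math.exp(-0.583 * x)
    n_max = int((16.0 / x) ** 2) + 1  # later terms are negligible
    total = sum(Phi(-0.5 * x * math.sqrt(n)) / n for n in range(1, n_max + 1))
    return 2.0 / (x * x) * math.exp(-2.0 * total)

def simpson(f, lo, hi, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3.0

def approx_3_8(b, m, m0, m1, d=2):
    """Right-hand side of (3.8), with a replaced by b^2/m (an assumption)."""
    a = b * b / m
    lo = b * math.sqrt(1.0 / m1 - 1.0 / m)
    hi = b * math.sqrt(1.0 / m0 - 1.0 / m)
    integral = simpson(lambda x: nu(x + a / x) / x, lo, hi)
    const = b ** d * math.exp(-0.5 * b * b) * 2.0 ** (1.0 - d / 2.0) / math.gamma(d / 2.0)
    return const * integral

# Illustrative values only: m = 40 observations, m0 = 4, m1 = 36, boundary b = 3.1.
p_two_dim = approx_3_8(3.1, 40, 4, 36, d=2)
```

For $d = 1$ the constant reduces to $2b\phi(b)$, which is the leading factor in Siegmund's one-dimensional result quoted in the Appendix as (A.2.4).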

3.3.2.  When  the  independent  variable  is  fixed 

When the independent variable is fixed, the MLRS involves a two-dimensional Gaussian process with covariance function given in (3.4). Since the Gaussian process involved is again non-differentiable and non-stationary, we follow the same ideas as in Section 2.3.2 to derive an asymptotic expression for $\Pr\{M_3 > b\}$. However, the situation is more complicated than in Section 2.3.2, since we have to deal with a two-dimensional Gaussian process whose coordinates have nonzero covariance.

The following lemma reduces this boundary crossing problem for a Gaussian process with a one-dimensional time parameter and a two-dimensional state space to a problem involving a Gaussian process with a one-dimensional time parameter and a one-dimensional state space, so that the derivation of an asymptotic expression follows from modifications of the calculations in the one-dimensional case.

Lemma  3.3.2. 

Let $\{V(t) = (V_1(t), V_2(t))\}$ be a two-dimensional stochastic process. Then

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} \| V(i/m) \| > b \Big\} = \Pr\Big\{ \max_{m_0 \le i \le m_1} \sup_{0 \le \theta \le 2\pi} [\cos\theta\, V_1(i/m) + \sin\theta\, V_2(i/m)] > b \Big\}. \tag{3.9}$$

Proof : Note that

$$\cos\theta\, V_1(i/m) + \sin\theta\, V_2(i/m) = \langle (\cos\theta, \sin\theta),\ (V_1(i/m), V_2(i/m)) \rangle = \| V(i/m) \| \cos\omega,$$

where $\langle \cdot, \cdot \rangle$ is an inner product of two vectors and $\omega$ is the angle between $(\cos\theta, \sin\theta)$ and $(V_1(i/m), V_2(i/m))$. Then taking the supremum over $0 \le \omega \le 2\pi$, (3.9) holds. ∎
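The identity behind (3.9), namely that the supremum over the angle of the projection of a vector equals its Euclidean norm, can be checked numerically. The following is a minimal sketch; the vector `(v1, v2)` and the helper name `sup_projection` are arbitrary illustrative choices.

```python
import math

def sup_projection(v1, v2, grid=3600):
    """Maximum of cos(theta)*v1 + sin(theta)*v2 over a grid of angles in [0, 2*pi)."""
    return max(
        math.cos(2.0 * math.pi * k / grid) * v1 + math.sin(2.0 * math.pi * k / grid) * v2
        for k in range(grid)
    )

v1, v2 = 1.3, -0.4             # an arbitrary value of (V_1, V_2)
norm = math.hypot(v1, v2)      # Euclidean norm ||V||
approx = sup_projection(v1, v2)
```

On a fine grid of angles the maximum of the projection agrees with the norm up to the grid resolution, which is why crossing the boundary $b$ by $\|V\|$ is the same event as crossing it by the projection for some angle.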

We are now in a position to modify the arguments used in Section 2.3.2. To begin, we consider the case where $x_j = j/m$. From Lemma 3.3.3 through Theorem 3.3.7, it is assumed that $x_j = j/m$ for $j = 1, \ldots, m$, and $Z_m(i/m, \theta) = \cos\theta\, V_{1,m}(i/m) + \sin\theta\, V_{2,m}(i/m)$.

In Section 2.3.2, $\mathrm{Cov}[U_m(t + h), U_m(t)] = C(t)|h| + o(h)$, so that we took the distance between points of the grid, $h$, as $1/m$ to make $b^2 h \propto a$. Note, however, that

$$\mathrm{Cov}[Z_m(t + h, \theta + \delta),\ Z_m(t, \theta)] - 1 = C_1(t,\theta)|h| + C_2(t,\theta)\delta^2 + o(h) + o(\delta^2).$$

Thus under the assumption that $b^2/m \to a$ as $m \to \infty$ and $b \to \infty$, we take $h$ and $\delta$ such that $b^2 h \propto a$ and $b^2\delta^2 \propto a$, so that $h \propto \delta^2$. Hence we use the normalized process $Z_m^{t,\theta}(i, c) = b\,(Z_m(t + i/m,\ \theta + c/\sqrt{m}) - b)$, where $b^2/m \to a$.

Lemma  3.3.3. 

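Suppose that $x_j = j/m$ for $j = 1, \ldots, m$.

Let $Z_m(i/m, \theta) = \cos\theta\, V_{1,m}(i/m) + \sin\theta\, V_{2,m}(i/m)$, and

$Z_m^{t,\theta}(i, c) = b\,(Z_m(t + i/m,\ \theta + c/\sqrt{m}) - b)$, where $b^2/m = a$.

Then as $m \to \infty$ and $b \to \infty$,

$$E[\, Z_m^{t,\theta}(i, c) - x \mid Z_m^{t,\theta}(0, 0) = x \,] = -\mu_a(t,\theta)\, i - a c^2/2 + o(1),$$

$$\mathrm{Cov}[\, Z_m^{t,\theta}(i_1, c_1) - x,\ Z_m^{t,\theta}(i_2, c_2) - x \mid Z_m^{t,\theta}(0, 0) = x \,] = 2\mu_a(t,\theta) \min(i_1, i_2) + c_1 c_2 a + o(1),$$

where

$$\mu_a(t,\theta) = \frac{\{[1 - 6t(1-t)]\sin^2\theta - \sqrt{3}(2t - 1)\cos\theta\sin\theta + (1/2)\}\, a}{t(1-t)\, D(t)}, \qquad D(t) = 1 - 3t(1-t). \tag{3.10}$$

Proof : These results follow from straightforward calculations. ∎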
Lemma 3.3.4.

Fix $n$, $h$, and $a$. Then for each $(t,\theta)$ there is a constant $H_a(t,\theta,n,h) < \infty$ such that, if $b \to \infty$, $m \to \infty$, and $b^2/m \to a$, then

$$\Pr\Big\{ \max_{0 \le i \le n} \sup_{0 \le c \le h} Z_m(t + i/m,\ \theta + c/\sqrt{m}) > b \Big\} \Big/ [\phi(b)/b] \to 1 + H_a(t,\theta,n,h),$$

where

$$H_a(t,\theta,n,h) = \int_{-\infty}^{0} \exp(-x) \Pr\Big\{ \max_{0 \le i \le n} Y_a^{t,\theta}(i) + \sup_{0 \le c \le h} S_a(c) > -x \Big\}\, dx,$$

$Y_a^{t,\theta}(i)$ is a partial sum of i.i.d. normal random variables with mean $-\mu_a(t,\theta)$ and variance $2\mu_a(t,\theta)$,

$$S_a(c) = c\sqrt{a}\, S_1 - c^2 a/2 \quad \text{with } S_1 \sim N(0,1),$$

and $\{Y_a^{t,\theta}(i)\}$ and $\{S_a(c)\}$ are independent.

Proof : By the previous lemma, the limiting process can be represented as

$$Y_a^{t,\theta}(i) + S_a(c) = [\sigma(t,\theta) W(i) - \mu_a(t,\theta)\, i] + [c\sqrt{a}\, S_1 - c^2 a/2],$$

where $W$ is a standard Brownian motion and $\sigma^2(t,\theta) = 2\mu_a(t,\theta)$. Then, following the same argument as in Lemma 12.2.3 of Leadbetter, Lindgren, and Rootzén (1983),

$$\Pr\Big\{ \max_{0 \le i \le n} \sup_{0 \le c \le h} Z_m(t + i/m,\ \theta + c/\sqrt{m}) > b \Big\} \Big/ [\phi(b)/b] \to 1 + H_a(t,\theta,n,h),$$

where $c$ takes real values. ∎


Lemma  3.3.5. 


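For each $(t,\theta)$, there exists a function $H_a^*(t,\theta)$ such that

$$\lim_{n, h \to \infty} H_a(t,\theta,n,h)/(nh) = H_a^*(t,\theta) \quad \text{uniformly in } t \text{ and } \theta.$$

As $b \to \infty$ and $m \to \infty$,

$$\Pr\Big\{ \max_{t_0 \le i/m \le t_1}\ \sup_{0 \le c/\sqrt{m} \le 2\pi} Z_m(i/m,\ c/\sqrt{m}) > b \Big\} \Big/ [b^2\phi(b)] \to \int_{t_0}^{t_1}\!\!\int_0^{2\pi} H_a^*(t,\theta)\, d\theta\, dt \,/ (a\sqrt{a}).$$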
Proof : Let $n$ and $h$ be fixed integers and

$$B_{k,l} = \Big\{ \max_{kn \le i < (k+1)n}\ \sup_{lh \le c < (l+1)h} Z_m(i/m,\ c/\sqrt{m}) > b \Big\} = \Big\{ \max_{0 \le i < n}\ \sup_{0 \le c < h} Z_m((kn + i)/m,\ (lh + c)/\sqrt{m}) > b \Big\}.$$

Then it can be shown that

$$\Pr\Big\{ \max_{t_0 \le i/m \le t_1}\ \sup_{0 \le c/\sqrt{m} \le 2\pi} Z_m(i/m,\ c/\sqrt{m}) > b \Big\} \sim \sum_{k=K_0}^{K_1} \sum_{l=L_0}^{L_1} P\{B_{k,l}\},$$

where $K_0 n = m_0$, $K_1 n = m_1$, $L_0 = 0$, $L_1 = 2\pi\sqrt{m}/h$, $K_1 - K_0 = [m/n]$, and $L_1 - L_0 = [2\pi\sqrt{m}/h]$.

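Now Lemma 3.3.4 implies that

$$\sum_{k=K_0}^{K_1} \sum_{l=L_0}^{L_1} P\{B_{k,l}\} \sim [\phi(b)/b] \sum_{k=K_0}^{K_1} \sum_{l=L_0}^{L_1} [1 + H_a(kn/m,\ lh/\sqrt{m},\ n,\ h)]$$

$$\sim b^2\phi(b) \Big[ 2\pi + \frac{nh}{m^{3/2}} \sum_{k=K_0}^{K_1} \sum_{l=L_0}^{L_1} H_a(kn/m,\ lh/\sqrt{m},\ n,\ h) \Big] \Big/ (nh\, a\sqrt{a}).$$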


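Therefore

$$\Pr\Big\{ \max_{t_0 \le i/m \le t_1}\ \sup_{0 \le c/\sqrt{m} \le 2\pi} Z_m(i/m,\ c/\sqrt{m}) > b \Big\} \Big/ [b^2\phi(b)] \sim \lim_{n, h \to \infty} \int_{t_0}^{t_1}\!\!\int_0^{2\pi} H_a(t,\theta,n,h)\, d\theta\, dt \,/ (nh\, a\sqrt{a})$$

$$= \int_{t_0}^{t_1}\!\!\int_0^{2\pi} H_a^*(t,\theta)\, d\theta\, dt \,/ (a\sqrt{a}),$$

which completes the proof. ∎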


Lemma  3.3.6. 


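For each fixed $(t,\theta)$,

$$\int_{t_0}^{t_1}\!\!\int_0^{2\pi} H_a^*(t,\theta)\, d\theta\, dt \,/ a^{3/2} = \int_{t_0}^{t_1}\!\!\int_0^{2\pi} \mu_a(t,\theta)\, \nu\{2\mu_a^*(t,\theta)\}\, d\theta\, dt \,/ (a\sqrt{2\pi}), \tag{3.11}$$

where $\mu_a(t,\theta)$ was defined in (3.10) and $\mu_a^*(t,\theta) = \{\mu_a(t,\theta)/2\}^{1/2}$.

Proof : Note that

$$H_a(t,\theta,n,h) = \int_0^\infty \exp(x) \Pr\Big\{ \max_{0 \le i \le n} Y_a^{t,\theta}(i) + \sup_{0 \le c \le h} S_a(c) > x \Big\}\, dx,$$

where $Y_a^{t,\theta}(i)$ and $S_a(c)$ have the same representation as in Lemma 3.3.4. Let

$$\Pr\Big\{ \max_{0 \le i \le n} Y_a^{t,\theta}(i) + \sup_{0 \le c \le h} S_a(c) > x \Big\} = 1 - R(x).$$

Then

$$H_a(t,\theta,n,h) = \int_0^\infty \exp(x)[1 - R(x)]\, dx$$
$$= \int_0^\infty \exp(x) \int_x^\infty dR(y)\, dx$$
$$= \int_0^\infty\!\!\int_0^y \exp(x)\, dx\, dR(y)$$
$$= \int_0^\infty \exp(y)\, dR(y) - 1$$
$$= \int_0^\infty \exp(y)\, dF(y) \int_0^\infty \exp(y) g(y)\, dy - 1$$
$$= \Big\{ \int_0^\infty \exp(y)[1 - F(y)]\, dy + 1 \Big\} \Big\{ \int_0^\infty \exp(y) g(y)\, dy \Big\} - 1,$$

where

$$1 - F(y) = \Pr\Big\{ \max_{0 \le i \le n} Y_a^{t,\theta}(i) > y \Big\}, \qquad 1 - G(y) = \Pr\Big\{ \sup_{0 \le c \le h} S_a(c) > y \Big\}.$$

By the same argument as in Lemma 2.3.4, as $n \to \infty$,

$$\int_0^\infty \exp(y)[1 - F(y)]\, dy \,/ n \to \mu_a(t,\theta)\, \nu[2\mu_a^*(t,\theta)].$$

And it can be shown that, as $h \to \infty$,

$$\int_0^\infty \exp(y) g(y)\, dy \,/ h \to \{a/(2\pi)\}^{1/2},$$

using

$$g(y) = \begin{cases} 1/2, & \text{if } y = 0, \\ \phi(\sqrt{2y})/\sqrt{2y}, & \text{if } 0 < y < h^2 a/2, \\ \phi(y/(h\sqrt{a}) + h\sqrt{a}/2)/(h\sqrt{a}), & \text{if } y > h^2 a/2. \end{cases}$$

Then

$$H_a(t,\theta,n,h)/(nh\, a^{3/2}) \to \mu_a(t,\theta)\, \nu[2\mu_a^*(t,\theta)] \,/ (a\sqrt{2\pi}).$$ ∎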
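The limit $\int_0^\infty \exp(y) g(y)\, dy / h \to \{a/(2\pi)\}^{1/2}$ can be checked numerically. The sketch below integrates over the standard normal variable $S_1$ rather than over $y$, and counts the atom of mass $1/2$ at $y = 0$; under that convention the integral works out to $1 + h\{a/(2\pi)\}^{1/2}$, which has the stated limit after division by $h$. The helper names and the values of `a` and `h` are illustrative assumptions.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3.0

def sup_S(s, a, h):
    """sup over 0 <= c <= h of c*sqrt(a)*s - c^2*a/2 for a fixed value s of S_1."""
    c_star = min(max(s / math.sqrt(a), 0.0), h)   # maximiser clipped to [0, h]
    return c_star * math.sqrt(a) * s - c_star ** 2 * a / 2.0

def mean_exp_sup(a, h):
    """E[exp(sup_c S_a(c))], integrated piecewise over S_1 ~ N(0, 1)."""
    def integrand(s):
        return math.exp(sup_S(s, a, h) - 0.5 * s * s) / math.sqrt(2.0 * math.pi)
    top = h * math.sqrt(a)
    return (simpson(integrand, -10.0, 0.0)
            + simpson(integrand, 0.0, top)
            + simpson(integrand, top, top + 10.0))

a, h = 0.25, 4.0
numeric = mean_exp_sup(a, h)
closed_form = 1.0 + h * math.sqrt(a / (2.0 * math.pi))
```

The three pieces correspond to the three cases in the definition of $g$: $S_1 \le 0$ (supremum attained at $c = 0$), $0 < S_1 < h\sqrt{a}$ (interior maximum $S_1^2/2$), and $S_1 \ge h\sqrt{a}$ (supremum attained at $c = h$).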
Theorem 3.3.7.

Assume that $b \to \infty$, $m_0 \to \infty$, $m_1 \to \infty$, and $m \to \infty$ in such a way that for some $0 < t_0 < t_1 < 1$ and $a > 0$,

$$m_i/m \to t_i,\ i = 0, 1, \quad \text{and} \quad b^2/m \to a.$$

Then as $m \to \infty$,

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} \| V_m(i/m) \| > b \Big\} \sim b^2\phi(b) \int_{t_0}^{t_1}\!\!\int_0^{2\pi} \nu[2\mu_a^*(t,\theta)]\, \mu_a(t,\theta)\, d\theta\, dt \,/ (a\sqrt{2\pi}), \tag{3.12}$$

where $\mu_a(t,\theta)$ is defined in (3.10) and $\mu_a^*(t,\theta)$ is defined in (3.11).
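To make (3.12) concrete, the following sketch evaluates its right-hand side for equally spaced $x_j = j/m$ by nested Simpson integration. It rests on several assumptions: $\nu$ is taken to be the standard function $\nu(x) = 2x^{-2}\exp\{-2\sum_{n \ge 1} n^{-1}\Phi(-x\sqrt{n}/2)\}$ (with $\exp(-0.583x)$ for small $x$), since (2.10) is not reproduced here; $\mu_a$ follows one reading of (3.10) with denominator $t(1-t)D(t)$; $a$ is replaced by $b^2/m$; and all function names are hypothetical.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def nu(x):
    """Assumed form of nu from (2.10); exp(-0.583 x) is used for small x."""
    if x < 0.5:
        return math.exp(-0.583 * x)
    n_max = int((16.0 / x) ** 2) + 1
    total = sum(Phi(-0.5 * x * math.sqrt(n)) / n for n in range(1, n_max + 1))
    return 2.0 / (x * x) * math.exp(-2.0 * total)

def simpson(f, lo, hi, n):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3.0

def mu_a(t, theta, a):
    """mu_a(t, theta) as read from (3.10), equally spaced design."""
    u = t * (1.0 - t)
    D = 1.0 - 3.0 * u
    s, c = math.sin(theta), math.cos(theta)
    numer = (1.0 - 6.0 * u) * s * s - math.sqrt(3.0) * (2.0 * t - 1.0) * c * s + 0.5
    return max(numer, 0.0) * a / (u * D)   # guard against rounding below zero

def approx_3_12(b, m, t0, t1, n_t=40, n_theta=120):
    """Right-hand side of (3.12) with a = b^2/m."""
    a = b * b / m
    def inner(t):
        def f(theta):
            mu = mu_a(t, theta, a)
            return nu(2.0 * math.sqrt(mu / 2.0)) * mu
        return simpson(f, 0.0, 2.0 * math.pi, n_theta)
    return b * b * phi(b) * simpson(inner, t0, t1, n_t) / (a * math.sqrt(2.0 * math.pi))

# Row m = 40, b3 = 3.1058 of Table 8 (m0 = 0.1 m, m1 = 0.9 m); the table lists p = 0.1100.
p = approx_3_12(3.1058, 40, 0.1, 0.9)
```

Under these assumptions the computed value should land near the tabulated entry; small differences are to be expected from the quadrature and from the treatment of $\nu$.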

Table 8 indicates the accuracy of (3.12). These numerical results confirm that (3.12) is quite an accurate approximation to the significance level of $M_3$ and also gives a reasonable approximation to the tail probability of the null distribution of $M_4$. In the rest of this section, we generalize Theorem 3.3.7 to values of the $x_j$'s which satisfy some mild conditions. Proofs will be omitted since they follow closely those of the previous theorem.

Lemma  3.3.8. 

Suppose that $x_j = f(j/m)$, $j = 1, \ldots, m$, for some integrable function $f$ such that $f(0) = 0$ and $f(1) = 1$.

Let $Z_m(i/m, \theta) = \cos\theta\, V_{1,m}(i/m) + \sin\theta\, V_{2,m}(i/m)$, and

$Z_m^{t,\theta}(i, c) = b\,(Z_m(t + i/m,\ \theta + c/\sqrt{m}) - b)$, where $b^2/m = a$.

Then as $m \to \infty$ and $b \to \infty$,

$$E[\, Z_m^{t,\theta}(i, c) - x \mid Z_m^{t,\theta}(0, 0) = x \,] = -\mu_a(t,\theta)\, i - a c^2/2 + o(1),$$

$$\mathrm{Cov}[\, Z_m^{t,\theta}(i_1, c_1) - x,\ Z_m^{t,\theta}(i_2, c_2) - x \mid Z_m^{t,\theta}(0, 0) = x \,] = 2\mu_a(t,\theta) \min(i_1, i_2) + c_1 c_2 a + o(1),$$

where

$$\mu_a(t,\theta) = \{1/[t(1-t)] + \sin^2(\theta) A_1(t) - \cos\theta\sin\theta\, A_2(t)\}\, a / [2D(t)], \tag{3.13}$$

$$A_1(t) = \frac{t^2 D^2(t)[E(t)]^2 + 2h(t)g(t)[h(t)g(t) - tD(t)E(t)] - h(t)g^2(t)D(t)}{h(t)(D(t) - h(t))} - \frac{D(t)}{t(1-t)},$$

$$A_2(t) = 2[h(t)g(t) - tD(t)E(t)] / \{t(1-t)h(t)(D(t) - h(t))\}^{1/2},$$

$$D(t) = 1 - g^2(t)\, t(1-t),$$

$$E(t) = g(t) - (1-t)g'(t).$$

Proof : A straightforward calculation suffices. ∎
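As a consistency check, the general expression (3.13) should reduce to (3.10) in the equally spaced case, where $g(t) = \sqrt{3}$ and $h(t) = t^3$. The sketch below compares the two numerically. It is based on one reading of the definitions of $A_1$, $A_2$, $D$, and $E$ and of the denominator $t(1-t)D(t)$ in (3.10), so agreement supports that reading rather than proving it; the function names are hypothetical.

```python
import math

def mu_uniform(t, theta, a):
    """mu_a(t, theta) as read from (3.10), for x_j = j/m."""
    u = t * (1.0 - t)
    D = 1.0 - 3.0 * u
    s, c = math.sin(theta), math.cos(theta)
    numer = (1.0 - 6.0 * u) * s * s - math.sqrt(3.0) * (2.0 * t - 1.0) * c * s + 0.5
    return numer * a / (u * D)

def mu_general(t, theta, a, g, dg, h):
    """mu_a(t, theta) as read from (3.13); g, dg (derivative of g) and h are callables."""
    u = t * (1.0 - t)
    D = 1.0 - g(t) ** 2 * u
    E = g(t) - (1.0 - t) * dg(t)
    core = h(t) * g(t) - t * D * E
    A1 = ((t * D * E) ** 2 + 2.0 * h(t) * g(t) * core - h(t) * g(t) ** 2 * D) \
         / (h(t) * (D - h(t))) - D / u
    A2 = 2.0 * core / math.sqrt(u * h(t) * (D - h(t)))
    s, c = math.sin(theta), math.cos(theta)
    return (1.0 / u + s * s * A1 - c * s * A2) * a / (2.0 * D)

# Equally spaced design: g(t) = sqrt(3), g'(t) = 0, h(t) = t^3.
g = lambda t: math.sqrt(3.0)
dg = lambda t: 0.0
h = lambda t: t ** 3

max_gap = max(
    abs(mu_general(t, th, 0.25, g, dg, h) - mu_uniform(t, th, 0.25))
    for t in (0.1, 0.3, 0.5, 0.8)
    for th in (0.0, 0.4, 1.0, 2.5, 5.0)
)
```

With these choices `max_gap` is at the level of rounding error, so the two expressions agree on the grid of test points.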


Theorem  3.3.9. 

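Suppose that $x_j = f(j/m)$, $j = 1, \ldots, m$, for some integrable function $f$ such that $f(0) = 0$ and $f(1) = 1$.

Assume that $b \to \infty$, $m_0 \to \infty$, $m_1 \to \infty$, and $m \to \infty$ in such a way that for some $0 < t_0 < t_1 < 1$ and $a > 0$,

$$m_i/m \to t_i,\ i = 0, 1, \quad \text{and} \quad b^2/m \to a.$$

Then as $m \to \infty$,

$$\Pr\Big\{ \max_{m_0 \le i \le m_1} \| V_m(i/m) \| > b \Big\} \sim b^2\phi(b) \int_{t_0}^{t_1}\!\!\int_0^{2\pi} \nu[2\mu_a^*(t,\theta)]\, \mu_a(t,\theta)\, d\theta\, dt \,/ (a\sqrt{2\pi}), \tag{3.14}$$

where $\mu_a(t,\theta)$ is defined in (3.13) and $\mu_a^*(t,\theta) = \{\mu_a(t,\theta)/2\}^{1/2}$.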


In the case of $x_j = f(j/m)$, $\mu_a(t,\theta)$ involves two different functions $h$ and $g$ through which the distribution of the test statistic depends on the configuration of the $x_j$'s. As a matter of calculation, this case is more complicated than the case of uniformly spaced $x_j$'s. However, the previous Monte Carlo experiments lead us to expect (3.14) to be quite a good approximation. In this chapter, we have not considered powers and confidence regions. For a confidence region for the change point the method of Cox and Spjøtvoll (1981) can be used, and the argument in Section 2.4 might lead us to a generalization of approximations to powers and confidence regions.


Chapter  4 


Concluding  Remarks 


As discussed in Chapter 1, the exact null distributions of most of the likelihood ratio statistics are too complicated to deal with. Most previous work has been done by numerical or Monte Carlo methods, e.g. Quandt (1958), Beckman and Cook (1979), Maronna and Yohai (1978), etc. An analytic approach was taken by Worsley (1983), who derived approximations to upper bounds of the null distribution functions of the likelihood ratio statistics.

An important characteristic of the tests considered in Chapters 2 and 3 is that they involve Gaussian processes. Using methods developed to solve boundary crossing problems for Gaussian processes, we derived quite accurate approximations to significance levels in various cases. The models that we studied are simple linear regression models. Although we do not consider more complicated models and related problems like confidence regions in general cases, this dissertation may give some insight into those problems. Note that in both (2.16) and (3.12), $b\phi(b)\int_{t_0}^{t_1} \nu[2\mu_a^*(t,\cdot)]\mu_a(t,\cdot)\,dt/a$ accounts for the boundary crossing probabilities of the given Gaussian processes with respect to time, and the integration with respect to the angle $\theta$ is involved in (3.12) basically because of the angle parameter introduced to reduce the two-dimensional problem to the one-dimensional case. This comparison may lead to a generalization of our results. In testing for a change in the coefficients of the multiple regression model, the MLRS is the maximum norm of a $d$-dimensional Gaussian process. By the same argument as in Lemma 3.3.2, we can convert this boundary crossing problem for a $d$-dimensional Gaussian process to a one-dimensional problem with additional angle parameters. Thus, in general, once the covariance function of the Gaussian process is evaluated, a similar argument may be applied to find asymptotic expressions to approximate significance levels. In our models, the change point is assumed to be one of the data points. Thus our model might be suitable for a set of data which involves discrete time, such as annual gross domestic product, number of accidents in consecutive years, and so on. Hinkley (1971) studied a set of data obtained from an experiment to determine the relationship between blood factor VII production and warfarin concentration. In such a case, it is more reasonable to consider a continuous model in which a change occurs at some point in the range of the independent variable and the two-phase regression line is continuous. This example also gives a good explanation of why we need to think about the two-phase regression model rather than some alternative such as a parabolic one.

In many cases a two-phase regression can only be a reasonable approximation, adequate for many purposes. However, it is also important to find an appropriate model. As Beckman and Cook (1979) pointed out by example, the continuity assumption may lead to very different estimates of the parameters. The choice of the model is to some extent a matter of experience and common sense. Even though the model should be decided from the biological, economic, or some other particular viewpoint, our model can be applied to give some insight into deciding whether the regression relationship has changed, and our approximations can be used as convenient standards.


Table 1.  Approximations to Pr{M_1 > b}

When Only the Intercept Changes

  m       b        P1        P2      True probability
 10    1.9571      --      0.2656         0.25
       2.3854    0.2393    0.1037         0.10
       2.6595    0.1340    0.0511         0.05
       3.1492    0.0384    0.0118         0.01
 15    2.0171    0.4522    0.2996         0.25
       2.4412    0.2142    0.1178         0.10
       2.7224    0.1159    0.0568         0.05
       3.2220    0.0312    0.0126         0.01
 20    2.0632    0.4215    0.2657         0.25
       2.4733    0.2006    0.1080         0.10
       2.7321    0.1133    0.0556         0.05
       3.2963    0.0250    0.0102         0.01
 30    2.1190    0.3856    0.2634         0.25
       2.5253    0.1800    0.1077         0.10
       2.8006    0.0961    0.0529         0.05
       3.3963    0.0185      --           0.01
 40    2.1487    0.3672      --           0.25
       2.5598    0.1672    0.1068         0.10
       2.8429    0.0866    0.0518         0.05
       3.3241    0.0230    0.0121         0.01
 70    2.2092    0.3313    0.2602         0.25
       2.6274    0.1441    0.1027         0.10
       2.9131    0.0726    0.0487         0.05
       3.4527    0.0155    0.0093         0.01



Table 3.  Approximations to Pr{M_i > b_i}, i = 1, 2

When Only the Intercept Changes
x_j = j/m, j = 1, ..., m

  m      b1        p         b2       (p')      True probability
 10    2.0756      --      2.2625   (0.1736)        0.25
       2.4519      --      2.4582   (0.0941)        0.10
       2.7157      --      2.6236   (0.0673)        0.05
       3.2747      --      2.8425   (0.0314)        0.01
 15    2.1695      --      2.3093   (0.1931)        0.25
       2.5812      --      2.6036   (0.0897)        0.10
       2.8500      --      2.7647   (0.0565)        0.05
       3.3413      --      3.0455   (0.0236)        0.01
 20    2.2315      --      2.3395   (0.2136)        0.25
       2.6245    0.1067    2.6553   (0.0946)        0.10
       2.8665    0.0539    2.8388   (0.0559)        0.05
       3.3924    0.0098    3.1921   (0.0183)        0.01
 30    2.3271      --      2.3939   (0.2314)        0.25
       2.7133      --      2.7332   (0.0969)        0.10
       2.9598      --      2.9517   (0.0516)        0.05
       3.4829    0.0093    3.3466   (0.0144)        0.01
 40    2.3632    0.2846    2.4154   (0.2498)        0.25
       2.7509    0.1083    2.7605      --           0.10
       3.0144      --      2.9825      --           0.05
       3.5135      --      3.4327      --           0.01
 70    2.4357      --      2.4635   (0.2736)        0.25
       2.8348    0.1078    2.8328      --           0.10
       3.1037    0.0497    3.0712      --           0.05
       3.5847    0.0102    3.5166   (0.0128)        0.01

m0 = 0.1 * m,  m1 = 0.9 * m
b1 : Percentiles of M_1 (σ² is known)
p  : Approximations by (2.16)
b2 : Percentiles of M_2 (σ² is unknown)
p' : Approximations by (2.16)


Table 4.  Approximations to Pr{M_1 > b} :  x_j = j/m, j = 1, ..., m.

   b     m=10    m=20    m=30    m=40    m=50    m=60    m=70    m=80    m=90
 2.00   0.3150  0.4569  0.5416  0.6002  0.6441  0.6787  0.7069  0.7306  0.7508
 2.05   0.2829  0.4136  0.4922  0.5468  0.5877  0.6201  0.6465  0.6687  0.6877
 2.10   0.2534  0.3733  0.4460  0.4966  0.5347  0.5648  0.5895  0.6102  0.6280
 2.15   0.2263  0.3359  0.4028  0.4496  0.4849  0.5129  0.5358  0.5551  0.5716
 2.20   0.2015  0.3014  0.3628  0.4059  0.4385  0.4643  0.4856  0.5034  0.5188
 2.25   0.1789  0.2696  0.3257  0.3653  0.3953  0.4191  0.4387  0.4552  0.4694
 2.30   0.1584  0.2404  0.2916  0.3278  0.3553  0.3772  0.3952  0.4104  0.4234
 2.35   0.1398  0.2138  0.2603  0.2933  0.3184  0.3384  0.3550  0.3689  0.3809
 2.40   0.1231  0.1896  0.2317  0.2616  0.2845  0.3028  0.3179  0.3306  0.3416
 2.45   0.1081  0.1676  0.2056  0.2327  0.2535  0.2701  0.2838  0.2954  0.3054
 2.50   0.0946  0.1478  0.1819  0.2064  0.2252  0.2402  0.2527  0.2632  0.2723
 2.55   0.0826  0.1300  0.1605  0.1825  0.1995  0.2131  0.2243  0.2339  0.2421
 2.60   0.0719  0.1139  0.1412  0.1610  0.1762  0.1884  0.1986  0.2072  0.2146
 2.65   0.0625  0.0996  0.1239  0.1416  0.1552  0.1662  0.1753  0.1830  0.1897
 2.70   0.0541  0.0868  0.1084  0.1241  0.1363  0.1461  0.1543  0.1612  0.1672
 2.75   0.0468  0.0755  0.0946  0.1085  0.1194  0.1281  0.1354  0.1416  0.1470
 2.80   0.0403  0.0655  0.0823  0.0946  0.1042  0.1120  0.1185  0.1240  0.1288
 2.85   0.0346  0.0566  0.0714  0.0823  0.0908  0.0977  0.1034  0.1083  0.1126
 2.90   0.0297  0.0488  0.0618  0.0714  0.0788  0.0849  0.0900    --      --
 2.95   0.0254  0.0420  0.0533  0.0617  0.0683  0.0736  0.0781    --      --
 3.00   0.0216  0.0360  0.0459  0.0532  0.0590  0.0637  0.0676  0.0710    --
 3.05   0.0184  0.0308  0.0394  0.0458  0.0508  0.0549  0.0584  0.0613  0.0639
 3.10   0.0156  0.0263  0.0337  0.0392  0.0436  0.0472  0.0502  0.0528  0.0551
 3.15   0.0132  0.0223  0.0288  0.0336  0.0374  0.0405  0.0431  0.0454  0.0473
 3.20   0.0111  0.0190  0.0245  0.0286  0.0319  0.0346  0.0369    --      --
 3.25   0.0094  0.0160  0.0208  0.0244  0.0272  0.0295  0.0315    --      --
 3.30   0.0079  0.0135  0.0176  0.0207  0.0231  0.0251  0.0268  0.0283    --
 3.35   0.0066  0.0114  0.0149  0.0175  0.0196  0.0213  0.0228  0.0240  0.0251
 3.40   0.0055  0.0096  0.0125  0.0148  0.0165  0.0180  0.0193    --    0.0213
 3.45   0.0046  0.0080  0.0105  0.0124  0.0139  0.0152  0.0163    --    0.0180
 3.50   0.0038  0.0067  0.0088  0.0104  0.0117  0.0128  0.0137  0.0145  0.0152
 3.55   0.0032  0.0056  0.0073  0.0087  0.0098  0.0107  0.0115  0.0122  0.0128
 3.60   0.0026  0.0046  0.0061  0.0073  0.0082  0.0090  0.0097  0.0102  0.0107
 3.65   0.0026  0.0038  0.0051  0.0061  0.0068  0.0075  0.0081  0.0085    --
 3.70   0.0018  0.0032  0.0042  0.0050  0.0057  0.0062  0.0067  0.0071    --

m0 = 0.1 * m,  m1 = 0.9 * m

Critical value    2.8253 (10%)    2.7509 (10%)

Table 8.  Approximations to Pr{M_i > b_i}, i = 3, 4

When Both the Intercept and Slope Change
x_j = j/m, j = 1, ..., m

  m      b3        p         b4       (p')      True probability
 10    2.3440    0.2917    2.5149   (0.2186)        0.25
       2.7557    0.1078    2.7157   (0.1352)        0.10
       3.0350    0.0488    2.8207   (0.1032)        0.05
       3.5315    0.0095    2.9741   (0.0679)        0.01
 20    2.5893    0.2746    2.6871   (0.2287)        0.25
       2.9803    0.1015    2.9628   (0.1134)        0.10
       3.2348    0.0481    3.1387   (0.0691)        0.05
       3.7381    0.0088    3.4394   (0.0273)        0.01
 30    2.6711    0.2877    2.7411   (0.2510)        0.25
       3.0584    0.1065    3.0619   (0.1098)        0.10
       3.3040    0.0516    3.2483   (0.0642)        0.05
       3.7634    0.0110    3.6044   (0.0206)        0.01
 40    2.7121    0.3025    2.7772   (0.2649)        0.25
       3.1058    0.1100    3.1156   (0.1101)        0.10
       3.3700    0.0502    3.3256   (0.0596)        0.05
       3.8502    0.0098    3.7341   (0.0156)        0.01
 50    2.7636    0.2971    2.8077   (0.2717)        0.25
       3.1519    0.1122    3.1496   (0.1115)        0.10
       3.3863    0.0504    3.3647   (0.0593)        0.05
       3.8851    0.0099    3.7778   (0.0152)        0.01

m0 = 2, m1 = 8 for m = 10
m0 = 0.1 * m, m1 = 0.9 * m for m >= 20

b3 : Percentiles of M_3 (σ² is known)
p  : Approximations by (3.12)
b4 : Percentiles of M_4 (σ² is unknown)
p' : Approximations by (3.12)


Appendices 

A.l.  Basic  Facts  about  Convergence  of  Probability  Measures 

Convergence in distribution of a sequence $\{X_n\}$ of real random variables is traditionally defined to mean convergence of distribution functions at each continuity point of the limit distribution function. For random elements of more general spaces not equipped with a partial ordering, even the concept of distribution function disappears. Chapter 1 of Billingsley (1968) summarizes convergence in distribution for a sequence of random elements, and we now define convergence in distribution for random elements using his results.

Let $C = C[0,1]$ be the space of continuous functions on $[0,1]$, where we give $C$ the uniform topology by defining the distance between the points $x$, $y$ as

$$d(x, y) = \sup_{0 \le t \le 1} | x(t) - y(t) |.$$

Chapter 2 of Billingsley (1968) contains the theory of weak convergence in the space $C$ which is used in this dissertation. Here, we include a brief review of definitions and theorems which are basic and important.

Suppose now that $\{X_n\}$ is a sequence of random elements in $C$. That is, for each $\omega$ in $\Omega$, $X_n(\omega)$ is an element of $C$ whose value at $t$ we denote by $X_n(t,\omega)$. For points $t_1, \ldots, t_k$ in $[0,1]$, let $\pi_{t_1 \cdots t_k}$ be the mapping that carries the point $x$ of $C$ to the point $(x(t_1), \ldots, x(t_k))$ of $R^k$. The finite dimensional sets are now defined as sets of the form $\pi^{-1}_{t_1 \cdots t_k} H$ with $H \in \mathcal{R}^k$, and the finite dimensional distributions of $X_n$ as those of $\pi_{t_1 \cdots t_k} X_n$. Since $C$ with the uniform metric is separable and complete, the finite dimensional sets generate the class of Borel sets. However, the convergence in distribution of $\pi_{t_1 \cdots t_k} X_n$ does not imply the convergence of $X_n$ in distribution. The difficulty and interest of weak convergence in $C$ all come from the fact that it involves considerations going beyond those of finite dimensional sets. Here is an idea which provides a powerful technique for proving weak convergence in $C$. If the finite dimensional distributions of $X_n$ converge and every subsequence of $\{X_n\}$ contains a further subsequence which converges in distribution, then $X_n$ converges in distribution. In the space $C$ this subsequence condition is equivalent to "tightness", which is a condition that has the effect of preventing the escape of mass to infinity in a certain sense. Now we define tightness of a sequence of random elements as follows: $\{X_n\}$ is tight if $\{X_n(0)\}$ is tight on the line and if for each positive $\varepsilon$ and $\eta$ there exist a $\delta$, $0 < \delta < 1$, and an integer $n_0$ such that

$$\frac{1}{\delta} \Pr\Big\{ \sup_{t \le s \le t + \delta} | X_n(s) - X_n(t) | \ge \varepsilon \Big\} \le \eta$$

for $n \ge n_0$ and $0 \le t \le 1$. Then we have the following result.

Theorem  A. 1.1 

Let $X_n$, $X$ be random elements of $C$. If the finite dimensional distributions of $X_n$ converge to those of $X$, and if $\{X_n\}$ is tight, then $X_n \Rightarrow X$.

To obtain the limiting distributions of the test statistics defined in Sections 2.1 and 3.1 when the independent variables are random, Donsker's theorem was used as an important tool. Donsker formulated a refinement of the central limit theorem by proving weak convergence of the distributions of certain random functions constructed from the partial sums.

Theorem  A. 1.2  (Donsker) 

Let $y_1, y_2, \ldots$ be i.i.d. random variables with mean 0 and finite, positive variance $\sigma^2$, and let $S_n = y_1 + \cdots + y_n$. Define a random element $X_n$ of $C$ by

$$X_n(t, \omega) = \frac{1}{\sigma\sqrt{n}} S_{[nt]}(\omega) + (nt - [nt]) \frac{1}{\sigma\sqrt{n}} y_{[nt]+1}(\omega).$$

Then as $n \to \infty$, $X_n$ converges to a Brownian motion process in distribution.
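The random function in Donsker's theorem is simply the linearly interpolated, rescaled partial-sum path. A minimal sketch of the construction is given below; the function name `donsker_path` and the short sample `y` are illustrative assumptions, and $S_0$ is taken to be 0.

```python
import math

def donsker_path(y, sigma):
    """Return X_n as a function on [0, 1] built from observations y_1, ..., y_n."""
    n = len(y)
    partial = [0.0]                      # partial[k] = S_k, with S_0 = 0
    for value in y:
        partial.append(partial[-1] + value)
    scale = sigma * math.sqrt(n)

    def X(t):
        k = min(int(math.floor(n * t)), n)   # k = [nt]
        frac = n * t - k                     # nt - [nt]
        nxt = y[k] if k < n else 0.0         # y_{[nt]+1}; not needed at t = 1
        return (partial[k] + frac * nxt) / scale

    return X

# A tiny deterministic example with n = 4 and sigma = 1.
X = donsker_path([1.0, -1.0, 1.0, 1.0], sigma=1.0)
values = [X(0.0), X(0.125), X(0.5), X(1.0)]
```

At the grid points $t = k/n$ the path equals $S_k/(\sigma\sqrt{n})$, and between grid points it is linear, so each $X_n$ is an element of $C$.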

A.2.  Applications of Boundary Crossing Probabilities to Change-Point Problems

Methods developed to approximate boundary crossing probabilities in fixed sample statistical problems provide an important tool in this dissertation. In particular, the results in Siegmund (1986) and James, James, and Siegmund (1987) developed for change-point problems were used to approximate the significance levels of the modified likelihood ratio statistics defined in Sections 2.1 and 3.1. The above two papers are concerned with the problem of testing a sequence of normal random variables with constant, known or unknown, variance for no change in mean versus alternatives with a single change-point.

Let $x_1, \ldots, x_m$ be independent random variables and consider the case where the $x_i$'s are normally distributed with mean $\mu_i$ and constant variance. One specific problem is to test

$$H_0 : \mu_1 = \cdots = \mu_m \quad \text{against}$$

$$H_1 : \exists\ 1 \le \rho < m \ \text{ such that } \ \mu_1 = \cdots = \mu_\rho \ne \mu_{\rho+1} = \cdots = \mu_m.$$

When the variance is known, Siegmund (1986) suggests

$$\max_{m_0 \le k \le m_1} | S_k - kS_m/m | / \{k(1 - k/m)\}^{1/2} \tag{A.2.1}$$

as a test statistic and derives an approximation to the significance level of the test based on (A.2.1). As an application of the theory of weak convergence of stochastic processes,

$$\Pr\Big\{ \max_{m_0 \le k \le m_1} | S_k - kS_m/m | / \{k(1 - k/m)\}^{1/2} > b \Big\} \tag{A.2.2}$$

can be approximated by the corresponding probability defined in terms of a Brownian motion process $W(t)$ $(0 \le t < \infty)$. That is, (A.2.2) is approximately

$$\Pr\{ | W_0(t) | > b\{t(1-t)\}^{1/2} \ \text{for some } \varepsilon_1 \le t \le 1 - \varepsilon_2 \}$$
$$= (b - b^{-1})\phi(b) \log[(1 - \varepsilon_1)(1 - \varepsilon_2)/(\varepsilon_1\varepsilon_2)] + 4b^{-1}\phi(b) + o(b^{-1}\phi(b)), \tag{A.2.3}$$

which is given in Siegmund (1986). The following theorem, given in Siegmund (1986), provides an approximation to the significance level of the test statistic (A.2.1), taking discreteness into consideration.

Theorem  A.2.1. 

Assume that $b \to \infty$, $m_0 \to \infty$, $m \to \infty$ in such a way that for some $0 < t_0 < t_1 < 1$ and $b_0 > 0$,

$$m_i/m \to t_i,\ i = 0, 1, \quad \text{and} \quad b/\sqrt{m} = b_0.$$

Let  £  =  m£0  for  some  |£ol  €  (60(  1  -  <i){<o(l  -  <o)}^^o{#i(l  -  <1)}*)- 



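Then as $m \to \infty$,

$$\Pr\Big\{ \max_{m_0 \le k \le m_1} | S_k - kS_m/m | / \{k(1 - k/m)\}^{1/2} > b \Big\} \sim 2b\phi(b) \int_{b(m_1^{-1} - m^{-1})^{1/2}}^{b(m_0^{-1} - m^{-1})^{1/2}} x^{-1}\nu(x + b^2/(mx))\, dx + 2[1 - \Phi(b)], \tag{A.2.4}$$

where $\nu$ is given by (2.10).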
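The continuous-time approximation (A.2.3) and the discrete-time approximation (A.2.4) can be compared numerically, as sketched below. As before, this is a sketch under assumptions: $\nu$ is taken to be $\nu(x) = 2x^{-2}\exp\{-2\sum_{n \ge 1} n^{-1}\Phi(-x\sqrt{n}/2)\}$ because (2.10) is not reproduced here, the $o(\cdot)$ term in (A.2.3) is dropped, and the function names are hypothetical. The example uses $m = 10$, $m_0 = 1$, $m_1 = 9$ (so $\varepsilon_1 = \varepsilon_2 = 0.1$) and $b = 2.3854$, the 10% row for $m = 10$ in Table 1.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def nu(x):
    """Assumed form of nu from (2.10); exp(-0.583 x) is used for small x."""
    if x < 0.5:
        return math.exp(-0.583 * x)
    n_max = int((16.0 / x) ** 2) + 1
    total = sum(Phi(-0.5 * x * math.sqrt(n)) / n for n in range(1, n_max + 1))
    return 2.0 / (x * x) * math.exp(-2.0 * total)

def simpson(f, lo, hi, n=400):
    """Composite Simpson rule with n (even) subintervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3.0

def continuous_approx(b, eps1, eps2):
    """Right-hand side of (A.2.3) without the o(.) remainder."""
    log_term = math.log((1.0 - eps1) * (1.0 - eps2) / (eps1 * eps2))
    return (b - 1.0 / b) * phi(b) * log_term + 4.0 * phi(b) / b

def discrete_approx(b, m, m0, m1):
    """Right-hand side of (A.2.4)."""
    lo = b * math.sqrt(1.0 / m1 - 1.0 / m)
    hi = b * math.sqrt(1.0 / m0 - 1.0 / m)
    integral = simpson(lambda x: nu(x + b * b / (m * x)) / x, lo, hi)
    return 2.0 * b * phi(b) * integral + 2.0 * (1.0 - Phi(b))

p_continuous = continuous_approx(2.3854, 0.1, 0.1)
p_discrete = discrete_approx(2.3854, 10, 1, 9)
```

With these inputs the continuous approximation comes out at about 0.239 and the discrete one close to 0.10, which appears to agree with the P1 and P2 entries of that row of Table 1 and illustrates how much the correction for discreteness matters for small $m$.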

In the case of unknown and constant variance, James, James, and Siegmund (1987) considered the statistic

$$\max_{m_0 \le k \le m_1} \frac{| S_k - kS_m/m |}{\{k(1 - k/m)\}^{1/2}} \Big\{ \sum_{n=1}^{m} (x_n - \bar{x}_m)^2 \Big\}^{-1/2}$$

and provide the following approximation, which can be used to approximate the significance level.

Corollary  A. 2. 2 

Under  the  same  assumptions  as  in  Theorem  A.2.1, 


__  .  \\Sk~kSm/m\[  ,  jl 

Pr{  max  - r{m  >  (x„  -  xm)2}~2  >6} 

*2{m/(27r)}$  \  (1  -  x2){m~4)/2dx 
JtH, 

+  (2/n)H(l  -  62/rn)(m-4)/2  J  x~xv[x  -f  62/{m(l  -  bl)x}}dx , 
where  the  second  integral  on  the  right  side  is  over 

(6{(m^  -  m_1)/(l  -  ^)}j,6{(mo1  -  m-1)/(l  -  b 2)}?). 


(A.2.5) 
20. ABSTRACT.


This dissertation focuses on the problem of testing for a change in the regression model when errors are independently, normally distributed with constant, known or unknown variance. First we consider the regression model in which only the intercept changes at some unknown point (Model-1). Secondly, the model in which both intercept and slope change is considered (Model-2). In all cases, the likelihood ratio statistic (LRS) is of the form $U = \max_{1 \le i \le m} U_i$, where distributions of the $U_i$'s vary according to the assumptions.

In both models, we consider the likelihood ratio test (LRT) as the problem of the boundary crossing by the discrete stochastic process and study problems such as approximations to significance levels, powers, and confidence regions for a change point. First of all, we propose a modified LRT and discuss asymptotic properties of test statistics in cases of random and fixed independent variables. In both cases, we derive analytical approximations to significance levels. When the independent variables are random, the limiting distribution of the modified LRS is a function of a Brownian motion and approximations in Siegmund (1986, Annals of Statistics) are used. For fixed independent variables, the limiting distribution involves a Gaussian process with nondifferentiable sample paths. In this case, an approximation is derived assuming the known variance and mild conditions about the empirical distribution of the independent variable, using the argument in Leadbetter, Lindgren and Rootzén (1983, Chapter 12), modified for discrete time by Hogan and Siegmund (1986, Advances in Applied Mathematics). In Model-1, we are also concerned with the power of the LRT and confidence regions for a change point.

Numerical approximations of significance levels and powers of the LRT and the results of corresponding Monte Carlo experiments are obtained. We find that the simulations confirm that the theoretical results perform well and demonstrate that the results also can be applied to the unknown variance case.


